I’ve seen various blog posts and a few posts on this forum about this topic but none have answered my question. I am doing multilabel classification on tabular data.
Here is what I have.
train is the training data (800 columns) and
train_targets are the labels (206 columns, all values are either 0 or 1):
from fastai.tabular.all import *

cat_names = ['cat1', 'cat2', 'cat3']
cont_names = [x for x in train.columns if x not in cat_names]

# Build one space-separated label string per row of the 0/1 target frame
train_label_col = []
for i, row in enumerate(train_labels.itertuples()):
    vals = [','.join(str(ele).split()) for ele in row[1:]]
    train_label_col.append(' '.join(vals))
train['label'] = train_label_col

procs = [Categorify, FillMissing, Normalize]
splits = RandomSplitter()(range_of(train))
to = TabularPandas(train, procs, cat_names, cont_names, y_names="label", y_block=MultiCategoryBlock(), splits=splits)
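To make the label column concrete, here is a minimal pure-Python sketch of what the label-building loop above produces. The `Row` namedtuple, the `target_a`/`target_b`/`target_c` names, and the `build_label_col` helper are hypothetical stand-ins for `train_labels.itertuples()` with only three target columns; they are not part of my real data or of fastai.

```python
from collections import namedtuple

# Hypothetical stand-in for train_labels.itertuples(): each row is
# (Index, target_a, target_b, target_c) with 0/1 values.
Row = namedtuple("Row", ["Index", "target_a", "target_b", "target_c"])
rows = [Row(0, 0, 1, 0), Row(1, 1, 1, 0), Row(2, 0, 0, 0)]

def build_label_col(rows):
    """Same logic as the loop in the post: one space-joined string per row."""
    label_col = []
    for row in rows:
        # row[1:] skips the index; each 0/1 value becomes '0' or '1'
        vals = [','.join(str(ele).split()) for ele in row[1:]]
        label_col.append(' '.join(vals))
    return label_col

train_label_col = build_label_col(rows)
print(train_label_col)  # ['0 1 0', '1 1 0', '0 0 0']
```

So each entry of `train['label']` is a single Python string of digits, not a list of label names or a numeric array. pandas stores a column of strings with object dtype, which may be related to the `numpy.object_` mentioned in the error below.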
All of the above works fine, but when I run
dls = to.dataloaders(bs=1024)
I get the “Could not do one pass in your dataloader, there is something wrong in it” warning, and when I run
dls.show_batch(3)
it throws
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
When I run
learn = tabular_learner(dls, y_range=(0,1), layers=[500, 250], n_out=1, loss_func=F.binary_cross_entropy)
it works, but
learn.fit_one_cycle(5, 1e-2)
throws the same error as above.
Any help is greatly appreciated.