I have a simple dataframe with text in column 1 and class labels in column 2, where multiple labels are separated by a ‘;’
e.g.:
my_text | tag_list
I went to the shops and I couldnt find the dog I was looking for | G421;Z272
I am really not a fan. He looked odd to me. | Z241
What's the answer then? | H221;H206
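To make the intended split explicit, here is a minimal plain-Python sketch of what I expect splitting tag_list on ‘;’ to produce. It only mirrors the intent and is not fastai's implementation; the names rows, labels and vocab are made up for illustration.

```python
# Toy copy of the three example rows above: (my_text, tag_list).
rows = [
    ("I went to the shops and I couldnt find the dog I was looking for", "G421;Z272"),
    ("I am really not a fan. He looked odd to me.", "Z241"),
    ("What's the answer then?", "H221;H206"),
]

label_delim = ";"

# Each row should end up with a list of one or more labels.
labels = [tags.split(label_delim) for _, tags in rows]

# The set of distinct labels across all rows (what a multi-label vocab would hold).
vocab = sorted({tag for tag_list in labels for tag in tag_list})

print(labels)
print(vocab)
```

So the second row has a single label and the other two have two each, which is why I am using a multi-label target.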
I load this into a TextDataLoaders object as follows:
from fastai.text.all import *

# df_c_OPCS4 is the dataframe above; dls_lm is my language-model DataLoaders.
dls_clas = TextDataLoaders.from_df(df=df_c_OPCS4, path='/content/gdrive/MyDrive/Colab_data',
                                   valid_pct=0.2,
                                   text_vocab=dls_lm.vocab,
                                   text_col='my_text',
                                   label_col='tag_list',
                                   label_delim=";",
                                   y_names='tag_list',
                                   y_block=MultiCategoryBlock())
However, the output of dls_clas.show_batch() has two columns, called ‘text’ and ‘None’. I expected the second column to be called ‘category’.
This means that downstream functions don’t work. For example, when I run the text classifier and then call learn.show_results(), I get this error:
/usr/local/lib/python3.7/dist-packages/fastai/torch_core.py in show_title(o, ax, ctx, label, color, **kwargs)
462 ax.set_title(o, color=color)
463 elif isinstance(ax, pd.Series):
--> 464 while label in ax: label += '_'
465 ax = ax.append(pd.Series({label: o}))
466 return ax
TypeError: unsupported operand type(s) for +=: 'NoneType' and 'str'
I think the error occurs because label is None: the category column has no name, so there is no string to append ‘_’ to when the Series already contains that key.
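Here is a small sketch of how I think the failure happens. It is a stand-in I wrote to mirror the lines in the traceback, not fastai's actual code: show_title_sketch is a hypothetical name, and a plain dict stands in for the pd.Series row.

```python
def show_title_sketch(o, row, label=None):
    # Mirrors the loop in the traceback: make the label unique by adding '_'.
    while label in row:
        label += "_"  # raises TypeError when label is None
    return {**row, label: o}

# With no label, the target is stored under the key None (the 'None' column).
row = {"text": "What's the answer then?"}
row = show_title_sketch("H221;H206", row)

# Adding the prediction finds None already present and tries None += '_'.
error = None
try:
    show_title_sketch("H221", row)
except TypeError as e:
    error = str(e)
print(error)

# With a string label, the same two calls work and the second key gets a '_' suffix.
labelled = show_title_sketch("H221;H206", {"text": "What's the answer then?"}, label="category")
labelled = show_title_sketch("H221", labelled, label="category")
print(list(labelled))
```

If this reading is right, the first call explains the ‘None’ column in show_batch and the second call explains the TypeError in show_results.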
How can I ensure that the category column is labelled when I set up the MultiCategoryBlock in the dataloader?