Problem creating text data loader multi category

Pranath · March 27, 2021, 1:55pm

Hi there. I have been trying to create a text data loader using a language model I trained earlier. The labels are multi category, from a data frame with a ‘;’ delimiter. I’m getting these errors and having a lot of trouble trying to figure out why! Can anyone advise me?

My function call:

dls_clas = TextDataLoaders.from_df(df, path='.', text_col='message', label_col='tags', label_delim=';', valid_pct=0.1, text_vocab=dls_lm.train.vocab, y_block=MultiCategoryBlock())

And the error…

ilovescience · March 27, 2021, 7:27pm

Can you show what your DataFrame looks like?

Pranath · March 28, 2021, 9:50am

Hi, thanks for your reply. Yes, I have attached details of the data frame. What do you think could be the cause of this? The language model for this data set gets created fine.

BresNet · March 28, 2021, 12:15pm

Do you have any NaN values in the labels?

Pranath · March 28, 2021, 1:18pm

Thanks for your reply. Yes, there were NaNs/nulls in the labels. I removed these and remade the data loaders, but I still got the same errors, so it seems the NaNs were not the issue?

BresNet · March 28, 2021, 3:53pm

Then there must be some more float values in the labels column of the DataFrame.
A quick hack to replace every float in df.tags with an empty string (and split the remaining strings on ';') could be:
df['tags'] = ['' if isinstance(x, float) else x.split(';') for x in df.tags]
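If you would rather see which rows are the problem before overwriting anything, here is a rough sketch of the same idea in plain Python. The helper names non_string_labels and clean_label are made up for this example, and raw_tags is just a stand-in list with invented values in place of df['tags'].tolist(). Unlike the one-liner above, it keeps each cell as a ';'-delimited string instead of splitting it into a list.

```python
def non_string_labels(labels):
    """Return (position, value) pairs for every entry that is not a str."""
    return [(i, x) for i, x in enumerate(labels) if not isinstance(x, str)]


def clean_label(x, delim=";"):
    """Turn one raw label cell into a clean delimited string.

    Non-string cells (NaN, None, numbers) become '', meaning "no labels".
    Whitespace around each label and empty pieces are dropped.
    """
    if not isinstance(x, str):
        return ""
    parts = [p.strip() for p in x.split(delim)]
    return delim.join(p for p in parts if p)


# Stand-in for df['tags'].tolist(); these values are invented for illustration.
raw_tags = ["billing;refund", float("nan"), "login", None, " shipping ; ", 3.0]

bad = non_string_labels(raw_tags)             # which cells are not strings?
cleaned = [clean_label(x) for x in raw_tags]  # every cell is now a str

print([i for i, _ in bad])
print(cleaned)
```

On a real DataFrame the equivalent would be something like df['tags'] = [clean_label(x) for x in df['tags']]. I have not checked how the data loaders treat rows whose label string ends up empty, so dropping those rows instead may be the safer option.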