Text classifier doesn't return any validation metrics?

I’m trying to train a model by following along with the lesson3-imdb script. Everything works up to the point where I train the classifier, but training doesn’t report any metrics for the validation set.

data_clas = (TextList.from_csv('.', 'class_descriptions.csv', cols = 'text', vocab = data_lm.vocab )
            .split_by_idx(df['val_idx'])
            .label_from_df(cols = 'job_class')
            .databunch(bs = 128))
learn = text_classifier_learner(data_clas, drop_mult=0.5)
learn.load_encoder('fine_tuned_encoder')
learn.freeze()
learn.fit_one_cycle(4, 5e-2, moms=(0.8,0.7))

Total time: 03:02

epoch train_loss valid_loss accuracy
1 3.755421
2 3.576474
3 3.510875
4 3.017980

I’m new to both deep learning and Python, so I’m not sure where to even start diagnosing this… Was the validation set not formed properly, so that the metric calculations failed silently during training? Did I specify (or fail to specify) a parameter somewhere? Any help or insight is greatly appreciated!

Update:

data_clas.valid_ds

returns

LabelList
y: CategoryList (0 items)
[]...
Path: .
x: TextList (0 items)
[]...
Path: .

So the issue is somewhere in how I specified the indexes to create the validation set…
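
An empty validation set like this can be caught before training by checking the size of each split. Here is a minimal sketch, assuming only that the datasets support `len()`; `check_split` is a hypothetical helper, not part of fastai:

```python
def check_split(train_ds, valid_ds):
    """Return (n_train, n_valid); raise if the validation split is empty.

    Hypothetical helper: works on anything that supports len().
    """
    n_train, n_valid = len(train_ds), len(valid_ds)
    if n_valid == 0:
        # With no validation items there is nothing to compute
        # valid_loss or accuracy on.
        raise ValueError(
            f"validation set is empty ({n_train} training items); "
            "check how the split was specified"
        )
    return n_train, n_valid
```

With the objects from this post it would be called as `check_split(data_clas.train_ds, data_clas.valid_ds)` right after building the databunch.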

Figured it out. I don’t think it liked having the validation indexes kept separate from the CSV that holds the text and labels. Building the TextList from the DataFrame and splitting on its val_idx column worked fine:

data_clas = (TextList.from_df(df, cols = 'text', vocab = data_lm.vocab )
            .split_from_df(col = 'val_idx')
            .label_from_df(cols = 'job_class')
            .databunch(bs = 128))
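
One possible explanation, which is an assumption because the contents of `val_idx` aren't shown: `split_from_df` reads a column of per-row True/False flags, whereas `split_by_idx` expects a list of integer row positions. If `val_idx` is a flag column, a rough sketch of converting it into positions looks like this (`flags_to_positions` and the sample data are hypothetical):

```python
def flags_to_positions(flags):
    """Return the integer positions of the truthy entries in `flags`."""
    return [i for i, flag in enumerate(flags) if flag]


# Hypothetical flag column: True marks a row that belongs to the validation set.
val_flags = [False, True, False, False, True]

valid_positions = flags_to_positions(val_flags)
print(valid_positions)  # [1, 4]
```

With a real DataFrame, something like `np.flatnonzero(df['val_idx'])` should produce the same kind of list, which could then be passed to `split_by_idx`.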

I’ll leave this here for posterity in case anyone else was confused for the same reasons :slight_smile:
