How do I make predictions on a test dataset with fastai.text? I tried learn.get_preds,
but I get wrong results. Maybe the data are shuffled somehow.
First, I train the model:
from fastai.text.all import *
from sklearn.metrics import f1_score
dls = TextDataLoaders.from_df(df, text_col='text', label_col='target', seq_len=36)
learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5,
                                metrics=skm_to_fastai(f1_score), seq_len=36)
learn.fine_tune(4, 1e-2)
Then I load a dataset and make predictions on it (the same dataset in this example):
dl_test = learn.dls.test_dl(df, with_labels=True)
preds = learn.get_preds(dl=dl_test, with_decoded=True)
df['preds'] = preds[2] # I assume that `preds[1]` are the targets and `preds[2]` are the predicted labels
The resulting score is close to random:
f1_score(df['target'], df['preds'])
0.398
If I apply learn.predict row by row, the results are good, but it is very slow:
df['preds'] = df['text'].map(lambda x: learn.predict(x)[0])
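The proper score is also obtained when the targets and decoded predictions returned by get_preds are compared with each other, which suggests that the rows come back in a different order than in df: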
f1_score(preds[1], preds[2])
0.86
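To illustrate the suspected ordering problem, here is a minimal sketch in plain Python, not fastai code. It assumes that the order in which the rows were processed is available as a list of original row indices (with fastai this might come from something like dl_test.get_idxs(), but that is an assumption I have not verified for this case). The names restore_order, idxs and preds_in_processing_order are hypothetical.

```python
def restore_order(values, idxs):
    """Rearrange `values` so that position i holds the value for original row i.

    `idxs[k]` is assumed to be the original row index of the k-th processed item.
    """
    restored = [None] * len(values)
    for k, original_row in enumerate(idxs):
        restored[original_row] = values[k]
    return restored


# Hypothetical example: the rows were processed in the order 2, 0, 3, 1.
idxs = [2, 0, 3, 1]
preds_in_processing_order = ["c", "a", "d", "b"]
print(restore_order(preds_in_processing_order, idxs))  # ['a', 'b', 'c', 'd']
```

If the predictions could be realigned this way before being assigned to df['preds'], the score against df['target'] should match the one computed from the tuple returned by get_preds.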
For reference, here is the format of preds:
preds
(tensor([[0.8350, 0.1650],
         [0.8271, 0.1729],
         [0.7271, 0.2729],
         ...,
         [0.7816, 0.2184],
         [0.7872, 0.2128],
         [0.7755, 0.2245]]),
 TensorCategory([0, 0, 0, ..., 0, 0, 0]),
 tensor([0, 0, 0, ..., 0, 0, 0]))
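As a sanity check on my reading of this tuple, the sketch below takes the first three probability rows printed above and picks the index of the largest value in each row. It uses plain Python lists rather than tensors, and argmax_labels is a hypothetical helper. It also assumes that the decoded predictions in preds[2] are simply the argmax of preds[0], which I believe is the case for a classification loss but have not confirmed.

```python
def argmax_labels(rows):
    """Return, for each row of class probabilities, the index of the largest value."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


# First three probability rows copied from the output above.
probs = [[0.8350, 0.1650],
         [0.8271, 0.1729],
         [0.7271, 0.2729]]
print(argmax_labels(probs))  # [0, 0, 0]
```

This agrees with the leading values of the third element, so preds[2] does look like the predicted labels, in the same order as preds[0] and preds[1].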