I couldn’t verify whether show_batch includes test data. Suppose I create a language model databunch (data_lm) from a CSV file and a language model learner like this:
data_lm = (TextList.from_csv(path, 'train_valid.csv', cols='text')
           .random_split_by_pct(valid_pct=0.2, seed=None)
           #.split_from_df(col='is_valid')
           .label_for_lm()
           .add_test(df_test['text'])  # check
           .databunch(bs=32))
learn = language_model_learner(data_lm, pretrained_model=URLs.WT103_1)
Does the text column of the test set (df_test) take part in fine-tuning (learn.fit_one_cycle())?
Update: to see the test data in show_batch I had to pass ds_type (it defaults to the training set), so I guess the test set takes part in model fine-tuning.
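A note on the reasoning in the update: ds_type only selects which split show_batch displays, so its default does not by itself say which splits fit_one_cycle reads. The toy sketch below is not fastai code. ToyDataBunch, toy_fit and the reads counter are hypothetical names made up for illustration, and the loop assumes (not confirmed in this post) that training iterates over the train split and validates on the valid split only. It shows one way to check the question empirically: count how often each split is read during a fit.

```python
from enum import Enum


class DatasetType(Enum):
    # Hypothetical stand-in; only the member names mirror the ones used in the post.
    Train = 1
    Valid = 2
    Test = 3


class ToyDataBunch:
    """Toy container with three splits. NOT the fastai DataBunch."""

    def __init__(self, train, valid, test):
        self._splits = {
            DatasetType.Train: list(train),
            DatasetType.Valid: list(valid),
            DatasetType.Test: list(test),
        }
        # Number of items read from each split through batches().
        self.reads = {ds: 0 for ds in DatasetType}

    def batches(self, ds_type=DatasetType.Train):
        # Yield items from the chosen split and record every read.
        for item in self._splits[ds_type]:
            self.reads[ds_type] += 1
            yield item

    def show_batch(self, rows=2, ds_type=DatasetType.Train):
        # Display-only helper: returns the first rows of the chosen split.
        return self._splits[ds_type][:rows]


def toy_fit(data, epochs=1):
    # Assumed training loop: train split for updates, valid split for metrics.
    for _ in range(epochs):
        for _item in data.batches(DatasetType.Train):
            pass  # a weight update would happen here
        for _item in data.batches(DatasetType.Valid):
            pass  # a validation metric would be computed here


data = ToyDataBunch(train=["a", "b", "c"], valid=["d"], test=["e", "f"])
toy_fit(data, epochs=2)
print(data.reads)
```

In this toy, the test split is still visible through show_batch(ds_type=DatasetType.Test) even though the fit never reads it. Whether the same holds for data_lm would need checking against the fastai version in use.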