For LM fine-tuning on the target task, is it fair to use all the training and test data of the classification dataset (if it is a small dataset and you do not have enough separate data for unsupervised training)? Since we are using the complete classification dataset for language modeling, I am not sure whether the final classification performance would just be a result of overfitting.
But how would you know the efficacy of the tuned model? You could fine-tune on the training data and evaluate how it’s doing on the test data. Once it looks good enough, you can fine-tune one last time including the test data.
Thanks, but I think I may not have conveyed my question clearly. My concern is about doing language modeling on the training and test classification data itself, not on the training and test data specifically allocated for language modeling from the same dataset (like Jeremy does with the unsup data of the IMDB dataset).
My question is, is it fair to use the classification data for the purpose of language modeling?
OK, I was wrong: Jeremy was indeed using the whole dataset for IMDB LM fine-tuning. So I guess my concerns were invalid. If someone has any additional insight, please let me know.
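For anyone reading later, here is a minimal sketch of the setup as I understand it. It is plain Python, not the fastai API; the toy data and the `build_corpora` helper are made up for illustration. The point is that the language model sees the raw text of every review, while labels are only ever used by the classifier, and only those from the training split.

```python
# Toy labeled data; texts and labels are invented for illustration.
train = [("great movie", 1), ("terrible plot", 0)]
test = [("loved the acting", 1), ("boring and slow", 0)]


def build_corpora(train, test):
    """Hypothetical helper: build an LM corpus and the classifier splits."""
    # LM fine-tuning uses every text (train + test) but never a label.
    lm_texts = [text for text, _ in train + test]
    # The classifier is trained on the labeled training rows only.
    clf_train = list(train)
    # Test rows keep their labels, but only for evaluation.
    clf_test = list(test)
    return lm_texts, clf_train, clf_test


lm_texts, clf_train, clf_test = build_corpora(train, test)
print(len(lm_texts), len(clf_train), len(clf_test))
```

Under this split the test labels never enter training, which seems to be why using the test text for the LM step is usually considered acceptable.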
Please don’t cross-post questions. It gets confusing! I answered this in the other thread where you asked.
Sorry about that, I thought the in-class discussion wasn’t actively followed anymore.