Hi guys, I'm building a text classifier, so according to the course, we need to:

1. Build a language model: `learn_lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)`
2. Save its encoder: `learn_lm.save_encoder('encoder')`
3. Build a text classifier: `learn_clas = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)`
4. Load the language model's encoder: `learn_clas.load_encoder('encoder')`
I'm wondering: what if I don't build a language model and don't load its encoder into the classifier, i.e. I only do step 3? Will the classifier build its own encoder to understand sentences, or do we have to build a language model before building a text classifier?
I think that if you don't fine-tune a language model on your domain-specific dataset and go directly to step 3 (building the classifier), it will by default still use a standard language model (and encoder) that was pretrained on Wikipedia. You will likely need to train it for longer, and you might not reach as good quality as with fine-tuning.
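To make the two options concrete, here is a minimal sketch of both paths, assuming fastai v1 and that the `data_lm` / `data_clas` DataBunches from the thread already exist; the epoch counts and learning rates are placeholders, not tuned values:

```python
from fastai.text import *

# Path A: fine-tune a language model first (steps 1-2), then reuse its encoder.
learn_lm = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
learn_lm.fit_one_cycle(1, 1e-2)        # adapt the Wikipedia-pretrained LM to your corpus
learn_lm.save_encoder('encoder')

learn_clas = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
learn_clas.load_encoder('encoder')     # classifier starts from the domain-adapted encoder
learn_clas.fit_one_cycle(1, 1e-2)

# Path B: skip steps 1-2 entirely. With pretrained=True (the default),
# the classifier still starts from the generic Wikipedia-pretrained
# AWD_LSTM encoder, just without any domain adaptation.
learn_clas_b = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
learn_clas_b.fit_one_cycle(1, 1e-2)
```

So "only doing step 3" doesn't mean training an encoder from scratch: it means starting from the generic pretrained encoder instead of one fine-tuned on your own texts.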
Thank you so much~ I ran the experiment on my own dataset, and it turns out that, yes, training takes longer, but the accuracy when NOT building a language model and NOT using its encoder is actually much higher. That makes me unsure whether it is right to only do step 3 (build the classifier)?