ULMFit with BERT

(Max Tian) #1

I’ve looked through a couple of articles going over fine-tuning BERT with fastai, along with some forum discussions such as this one. What I’ve noticed is that in almost all cases, the tutorials on using BERT skip the LM finetuning step and go straight to the classification stage by using

bert_model = BertForSequenceClassification.from_pretrained(config.bert_model_name, num_labels=6)

The only guide I’ve found on finetuning a BERT LM is here.

Is there a way I can leverage fastai in the BERT LM finetuning stage? Or is finetuning just the BERT classifier enough (it looks like that’s what everyone’s doing)?
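To be concrete, here’s a toy sketch of the masked-LM objective that the LM finetuning stage would optimize. This uses made-up dimensions and a stand-in embedding + LM head rather than the real BERT weights, and a hypothetical `MASK_ID`; with the transformers library you’d get the same effect from `BertForMaskedLM` and its tokenizer’s mask token:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab_size, hidden = 100, 16
MASK_ID = 0  # hypothetical mask-token id, for illustration only

# Toy stand-in for a BERT encoder: embeddings + a linear "LM head".
embed = nn.Embedding(vocab_size, hidden)
lm_head = nn.Linear(hidden, vocab_size)

tokens = torch.randint(1, vocab_size, (4, 10))  # a batch of token ids
labels = tokens.clone()

# Mask ~15% of positions; the loss is computed only on the masked ones.
mask = torch.rand(tokens.shape) < 0.15
mask[0, 0] = True  # guarantee at least one masked position
inputs = tokens.masked_fill(mask, MASK_ID)
labels[~mask] = -100  # ignore_index so unmasked positions contribute no loss

logits = lm_head(embed(inputs))
loss = nn.functional.cross_entropy(
    logits.view(-1, vocab_size), labels.view(-1), ignore_index=-100
)
```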

0 Likes

#2

I think you typically just retrain the head. Finetuning the whole unfrozen network can be hard, since BERT models are large.
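A minimal sketch of head-only training, using toy modules as a stand-in for the pretrained BERT body (hypothetical shapes, not the real model):

```python
import torch.nn as nn

# Stand-in for a pretrained encoder plus a new classification head.
body = nn.Sequential(nn.Linear(8, 8), nn.ReLU())
head = nn.Linear(8, 6)

# Freeze the body so the optimizer only updates the head's weights.
for p in body.parameters():
    p.requires_grad = False
```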

1 Like

(Aaron) #3

There is a lot of focus on training the head in transfer learning, but it is also good practice to then unfreeze the whole network and run one (or a few) epochs at a lower learning rate as part of the fine-tuning process. That said, unfreezing the whole model is not required in every case to get good results.
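The unfreeze-then-lower-the-LR step can be sketched like this, again with toy stand-in modules and made-up learning rates rather than the real BERT encoder:

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained body + classification head.
body = nn.Sequential(nn.Linear(8, 8), nn.ReLU())
head = nn.Linear(8, 6)

# After the head has been trained with the body frozen, unfreeze
# everything and continue training the full network.
for p in body.parameters():
    p.requires_grad = True

# Use parameter groups so the pretrained body gets a lower learning
# rate than the freshly initialized head (fastai's discriminative LRs
# work on the same principle).
optimizer = torch.optim.AdamW([
    {"params": body.parameters(), "lr": 1e-5},  # lower LR: pretrained weights
    {"params": head.parameters(), "lr": 1e-4},  # higher LR: new head
])
```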

0 Likes