How difficult would it be to integrate their models for compatibility with fastai? It looks like even right now Transformer support in Fastai is still very early. Thanks.
Hi, I recently integrated fastai with BERT (using huggingface's pretrained models). I will be writing a Medium article and will share the notebook as soon as possible. Let me know if that helps.
You used the 'old' pytorch_pretrained_bert library instead of the new pytorch_transformers one. There is a breaking change: model outputs are now tuples.
They give the instructions to just do the following:
# Let's load our model
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
# If you used to have this line in pytorch-pretrained-bert:
loss = model(input_ids, labels=labels)
# Now just use this line in pytorch-transformers to extract the loss from the output tuple:
outputs = model(input_ids, labels=labels)
loss = outputs[0]
How would you integrate this change with your existing notebook? Would one have to write a custom basic_train.py to achieve this? I can't figure it out.
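One pattern that is sometimes used to bridge this kind of breaking change, sketched here in plain Python rather than as a definitive fix: wrap the model so its forward call returns only the first element of the output tuple, which is what a training loop expecting the old single-output behaviour needs. In real code you would subclass torch.nn.Module; the names below (FirstOutputWrapper, fake_bert) are hypothetical stand-ins.

```python
# Minimal sketch, assuming the new library returns a tuple of
# (loss_or_logits, extra_outputs...) instead of a single tensor.

class FirstOutputWrapper:
    """Wraps a callable that returns a tuple and exposes only its first element."""
    def __init__(self, model):
        self.model = model

    def __call__(self, *args, **kwargs):
        outputs = self.model(*args, **kwargs)  # new-style call returns a tuple
        return outputs[0]                      # keep only the first element, as before

# Stand-in for a pytorch-transformers model whose forward returns a tuple.
def fake_bert(input_ids, labels=None):
    return (0.42, "hidden_states")

model = FirstOutputWrapper(fake_bert)
loss = model([101, 2023, 102], labels=1)
print(loss)  # 0.42
```

The same idea is why no custom basic_train.py should be needed: the wrapper restores the old calling convention at the model boundary instead of inside the training loop.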
Also, from my experiments, I've been finding that unfreezing and training the entire model gives equal, if not better, performance than training the head first and then gradually unfreezing. It often saves a lot of time to just train the fully unfrozen model.
It's odd, but it sometimes seems to give better results.
I didn't take the time to check whether the tools fastai provides, like Discriminative Learning Rates, Gradual Unfreezing, or even Slanted Triangular Learning Rates, give better results with transformer architectures. So it's worth experimenting with these parameters!
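To make one of those knobs concrete, here is a small sketch of the slanted triangular learning rate schedule from the ULMFiT paper (Howard & Ruder, 2018). This is an illustration of the published formula, not fastai's internal implementation; the default values for cut_frac and ratio are the paper's.

```python
import math

def slanted_triangular_lr(t, T, lr_max=0.01, cut_frac=0.1, ratio=32):
    """LR at step t of T: short linear warm-up, then a long linear decay."""
    cut = math.floor(T * cut_frac)          # step at which the rate peaks
    if t < cut:
        p = t / cut                          # warm-up fraction
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # decay fraction
    return lr_max * (1 + p * (ratio - 1)) / ratio

# The rate reaches lr_max exactly at the cut point, then decays linearly.
print(slanted_triangular_lr(100, 1000))  # 0.01
```

The aggressive short warm-up followed by a slow decay is what lets the head adapt quickly without destroying the pretrained weights.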
I used Gradual Unfreezing to give people the option to use it. Maybe Gradual Unfreezing gives better performance with other model types or other datasets…
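For readers unfamiliar with the technique, here is a toy sketch of gradual unfreezing where layer groups are represented as dicts with a 'frozen' flag instead of real parameter groups. In fastai you would instead call learn.freeze_to(-stage) and then fit at each stage; the train callback here is a hypothetical placeholder.

```python
def gradual_unfreezing(groups, train):
    """Unfreeze layer groups one stage at a time, starting from the head."""
    for stage in range(1, len(groups) + 1):
        for i, g in enumerate(groups):
            # The last `stage` groups are trainable; earlier ones stay frozen.
            g["frozen"] = i < len(groups) - stage
        train(stage)

layers = [{"name": "embeddings", "frozen": True},
          {"name": "encoder",    "frozen": True},
          {"name": "head",       "frozen": True}]

gradual_unfreezing(
    layers,
    lambda s: print("stage", s, [g["name"] for g in layers if not g["frozen"]]))
```

By the final stage every group is unfrozen, so the last round of training is equivalent to training the whole model, which is why the fully-unfrozen approach described above can end up performing similarly.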
Thank you very much for all your remarks. If you have other questions, don't hesitate!
Thanks for your work and article. Really helpful.
I am actually participating in a Kaggle competition (Google QUEST) where I would like to use the Transformers integration with fastai. The problem is that it's an 'internet off' competition. Any idea how to use your Kaggle kernel with the internet disabled?