Fastai integration with huggingface pytorch-transformers?

xnet · July 16, 2019, 9:28pm

Has anyone used the huggingface pytorch-transformers repo?

They have SOTA language models, including the very recent XLNet. https://github.com/huggingface/pytorch-transformers

How difficult would it be to integrate their models for compatibility with fastai? It looks like even right now Transformer support in Fastai is still very early. Thanks.

abhikjha · July 17, 2019, 10:12pm

Hi, I recently integrated fastai with BERT (using huggingface’s pretrained models). I will write a Medium article and share the notebook as soon as possible. Let me know if that helps.

BR
Abhik

abhikjha · July 19, 2019, 5:26am

Hey, I just posted a reference to my Medium article in the “Share your work here” section.

Here is the link:

faib · July 19, 2019, 8:20am

Thank you for the implementation!

You used the “old” pytorch_pretrained_bert library instead of the new pytorch_transformers one. There is a breaking change, where model outputs are now tuples.
They give the instructions to just do the following:

# Let's load our model
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

# If you used to have this line in pytorch-pretrained-bert:
loss = model(input_ids, labels=labels)

# Now just use this line in pytorch-transformers to extract the loss from the output tuple:
outputs = model(input_ids, labels=labels)
loss = outputs[0]

How would you integrate this change with your existing notebook? Would one have to write a custom basic_train.py to achieve this? I can’t figure it out.
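
One possible way to handle the tuple output without writing a custom basic_train.py is a thin wrapper module that returns only the first element of the tuple, so the training loop sees a plain tensor again. The sketch below is an assumption, not code from the notebook: `FirstOutputOnly` and `TupleModel` are hypothetical names, and `TupleModel` only stands in for a pytorch-transformers model called without `labels` (in which case the first tuple element is the logits rather than the loss).

```python
import torch
import torch.nn as nn


class TupleModel(nn.Module):
    """Hypothetical stand-in for a pytorch-transformers model: returns a tuple."""

    def __init__(self, vocab_size=100, hidden=8, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, input_ids):
        hidden_states = self.embed(input_ids)          # (batch, seq, hidden)
        logits = self.head(hidden_states.mean(dim=1))  # (batch, n_classes)
        return (logits, hidden_states)                 # tuple, like the new library


class FirstOutputOnly(nn.Module):
    """Hypothetical wrapper: forwards the call and keeps only outputs[0]."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids):
        outputs = self.model(input_ids)
        return outputs[0]


wrapped = FirstOutputOnly(TupleModel())
input_ids = torch.randint(0, 100, (4, 16))  # batch of 4 sequences, 16 token ids each
logits = wrapped(input_ids)                 # a plain tensor, not a tuple
```

Because the wrapped module returns a single tensor, it should be usable with an ordinary loss function such as cross-entropy, with the loss computed outside the model instead of being read from `outputs[0]`.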

bfarzin · July 19, 2019, 3:53pm

Your notebook gives me a 404 when I try the link. Can you please re-post the link here?

abhikjha · July 19, 2019, 3:58pm

Oh, I will check the links in the article shortly. Meanwhile, here is the Kaggle link:

https://www.kaggle.com/abhikjha/jigsaw-toxicity-bert-with-fastai-and-fastai

abhikjha · July 19, 2019, 4:00pm

Hey Fabian

Yes, it so happened that just as I finished my project, they changed from pytorch_pretrained_bert to pytorch_transformers.

So I haven’t tried this on the newer version. In a few days I will see whether it can be done easily on the newer version as well.

BR
Abhik

harikrishnanrajeev · July 21, 2019, 12:26pm

Has anybody tried pytorch-transformers to build a QA model on a custom dataset?

nikhil_no_1 · July 23, 2019, 5:35am

I have started using it on custom datasets for sentiment analysis. I just finished with the IMDB movie reviews and will put the code on GitHub soon.

nikhil_no_1 · July 25, 2019, 3:56am

I posted this article, “Running Pytorch-Transformers on Custom Datasets”, with code and other details.

maroberti · September 2, 2019, 8:14am

Hello Fabian,

I think this article by David Zhao may help you!

He found an easy way to make the new pytorch_transformers library compatible with fastai.

He also made a notebook that you can find here.

faib · September 3, 2019, 12:45pm

Thank you @maroberti! I had already stumbled upon this link through the huggingface Twitter account, but that’s exactly what I was looking for.

bluteaur · November 18, 2019, 6:14pm

Hi,

I’m currently trying to use:

from transformers import *

This works great for BERT:

pretrained_tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
fastai_vocab = Vocab(list(pretrained_tokenizer.vocab.keys()))

But if I try the same thing with XLNet:

pretrained_tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
fastai_vocab = Vocab(list(pretrained_tokenizer.vocab.keys()))

I get this error:

AttributeError: 'XLNetTokenizer' object has no attribute 'vocab'

Does anyone know how to get the vocab for XLNet?

Thanks
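
A possible workaround for the error above, as a minimal sketch: XLNet's tokenizer is SentencePiece-based and does not keep a `vocab` dict, but the token list can be rebuilt by converting every id back to its token. This assumes the tokenizer exposes `vocab_size` and `convert_ids_to_tokens`. The helper name `vocab_tokens` and the two `Fake...` classes are hypothetical and exist only so the example runs without downloading a model.

```python
def vocab_tokens(tokenizer):
    """Return the tokenizer's tokens ordered by id.

    Uses the `vocab` dict when present (as on BertTokenizer); otherwise
    falls back to `vocab_size` and `convert_ids_to_tokens`.
    """
    vocab = getattr(tokenizer, "vocab", None)
    if vocab is not None:
        # `vocab` maps token -> id, so sort the tokens by id.
        return [tok for tok, _ in sorted(vocab.items(), key=lambda kv: kv[1])]
    return tokenizer.convert_ids_to_tokens(list(range(tokenizer.vocab_size)))


# Hypothetical stand-ins for the two kinds of tokenizer.
class FakeDictTokenizer:
    vocab = {"[PAD]": 0, "hello": 1, "world": 2}


class FakeSentencePieceTokenizer:
    _pieces = ["<unk>", "▁hello", "▁world"]
    vocab_size = len(_pieces)

    def convert_ids_to_tokens(self, ids):
        return [self._pieces[i] for i in ids]


bert_like = vocab_tokens(FakeDictTokenizer())
xlnet_like = vocab_tokens(FakeSentencePieceTokenizer())
```

With a real tokenizer the call would then look like `fastai_vocab = Vocab(vocab_tokens(pretrained_tokenizer))`.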

maroberti · November 28, 2019, 2:56pm

Hello,

I wrote an article that may resolve your problem.

Hope it will help!

maxmatical · November 28, 2019, 4:02pm

Compared to this article, I found the results to be a lot worse. Do you have any idea why that might be the case?

maroberti · November 28, 2019, 4:16pm

Hello Max,

Thank you for the question!
Dev Sharma’s article doesn’t use the same dataset, so comparing it with my implementation isn’t really meaningful.

maxmatical · November 28, 2019, 4:22pm

Oh you’re right, my mistake

In addition, from my experiments, I’ve been finding that unfreezing and training the entire model gives equal, if not better, performance than training the head first and then gradually unfreezing the model. It often seems to save a lot of time to just train the unfrozen model.
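
To make the two strategies concrete, here is a minimal sketch in plain PyTorch rather than fastai. The three-group model and the helper `set_trainable_groups` are hypothetical; the point is only which parameters have `requires_grad` set at each stage.

```python
import torch.nn as nn


def set_trainable_groups(layer_groups, n_trainable):
    """Freeze every group except the last `n_trainable` ones."""
    first_trainable = len(layer_groups) - n_trainable
    for idx, group in enumerate(layer_groups):
        for param in group.parameters():
            param.requires_grad = idx >= first_trainable


# Hypothetical three-group model: embeddings, encoder, classifier head.
layer_groups = [nn.Embedding(100, 8), nn.Linear(8, 8), nn.Linear(8, 2)]

# Gradual unfreezing: head only, then head + encoder, then everything.
stages = []
for n_trainable in range(1, len(layer_groups) + 1):
    set_trainable_groups(layer_groups, n_trainable)
    # ... train for an epoch or so here before unfreezing the next group ...
    stages.append([any(p.requires_grad for p in g.parameters()) for g in layer_groups])

# Training the unfrozen model directly corresponds to the final stage only.
set_trainable_groups(layer_groups, len(layer_groups))
all_trainable = all(p.requires_grad for g in layer_groups for p in g.parameters())
```

In fastai v1 the three stages roughly correspond to `learn.freeze()`, `learn.freeze_to(-2)` and `learn.unfreeze()`, each followed by a fit call.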

bluteaur · November 28, 2019, 4:50pm

Amazing, thank you for this article.

maroberti · November 28, 2019, 4:51pm

Yes, you are right!

It’s weird but it seems that sometimes it gives better results.

I didn’t take the time to check whether the tools provided by fastai, like discriminative learning rates, gradual unfreezing or even slanted triangular learning rates, give better results with the transformer architectures. So it’s good to experiment with these parameters!
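
For reference, two of these techniques can be sketched in a few lines of plain Python. The formulas follow my reading of the ULMFiT paper (a per-group division factor of 2.6, and a schedule with a short linear warm-up followed by a long linear decay); the function names and default values are assumptions for illustration, not fastai's implementation.

```python
import math


def discriminative_lrs(top_lr, n_groups, factor=2.6):
    """Per-group learning rates, lowest for the earliest group.

    Each group gets the rate of the group above it divided by `factor`.
    """
    return [top_lr / factor ** (n_groups - 1 - i) for i in range(n_groups)]


def slanted_triangular_lr(t, total_steps, lr_max, cut_frac=0.1, ratio=32):
    """Learning rate at step `t`: short linear warm-up, long linear decay."""
    cut = math.floor(total_steps * cut_frac)
    if t < cut:
        p = t / cut
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))
    return lr_max * (1 + p * (ratio - 1)) / ratio


lrs = discriminative_lrs(top_lr=1e-3, n_groups=3)
schedule = [slanted_triangular_lr(t, total_steps=100, lr_max=1e-3) for t in range(100)]
```

In fastai v1, discriminative rates are usually passed as `max_lr=slice(low, high)` to `fit_one_cycle`, which spreads the rates across the layer groups.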

I included gradual unfreezing so that people have the option to use it. Maybe it gives better performance with other model types or other datasets…

Thank you very much for all your remarks. If you have other questions don’t hesitate!

abhikjha · November 29, 2019, 4:47am

Hi @maroberti

Thanks for your work and the article. Really helpful.

I am actually participating in a Kaggle competition (Google QUEST) in which I would like to use the Transformers integration with fastai. The problem is that it’s an “internet off” competition. Any idea how to use your Kaggle kernel with the internet off?

Thanks
Abhik
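
One approach that may work for "internet off" kernels, sketched under assumptions: save the pretrained model and tokenizer to a folder while online, upload that folder as a Kaggle dataset, and then pass the local folder path to `from_pretrained` instead of a model name. The helper `load_from_local_dir`, the stand-in class `FakePretrained` and the folder names are hypothetical; the stand-in is there only so the sketch runs without transformers installed.

```python
import os
import tempfile


def load_from_local_dir(model_cls, tokenizer_cls, local_dir):
    """Load a model and tokenizer from files on disk instead of the network.

    `local_dir` is expected to hold what `save_pretrained` wrote earlier
    (a `config.json` plus the weight and vocabulary files).
    """
    if not os.path.isfile(os.path.join(local_dir, "config.json")):
        raise FileNotFoundError(f"no config.json in {local_dir!r}")
    tokenizer = tokenizer_cls.from_pretrained(local_dir)
    model = model_cls.from_pretrained(local_dir)
    return model, tokenizer


class FakePretrained:
    """Hypothetical stand-in for a transformers model or tokenizer class."""

    @classmethod
    def from_pretrained(cls, path):
        obj = cls()
        obj.loaded_from = path
        return obj


# Demo with a temporary folder that contains only an empty config.json.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "config.json"), "w").close()
    demo_model, demo_tokenizer = load_from_local_dir(
        FakePretrained, FakePretrained, demo_dir
    )

# With internet on (run once, then upload the folder as a Kaggle dataset):
#   os.makedirs("bert-local", exist_ok=True)
#   model.save_pretrained("bert-local")
#   tokenizer.save_pretrained("bert-local")
# With internet off (hypothetical dataset path):
#   load_from_local_dir(BertForSequenceClassification, BertTokenizer,
#                       "/kaggle/input/bert-local")
```

Any pip packages that the kernel installs at runtime would need the same treatment, since they cannot be downloaded with the internet off either.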