FastHugs - fastai-v2 and HuggingFace Transformers

I put together a notebook to finetune the BERT, ALBERT, DistilBERT and RoBERTa transformer models from HuggingFace for text classification using fastai-v2.

Code here:

Jupyter-blog post here: (first time using fast_template, I love this jupyter to post feature!)

Things You Might Like (:heart: ?)

FastHugsTokenizer: A tokenizer wrapper that can be used with fastai-v2’s tokenizer.

FastHugsModel: A model wrapper over the HF models, more or less the same as the wrappers from the HF + fastai-v1 articles mentioned below

Vocab: A function to extract the vocab depending on the pre-trained transformer (HF hasn’t standardised this process :cry:).

Padding: Padding settings for the padding token index and for whether the transformer prefers left or right padding

Vocab for Albert-base-v2: a .json file for Albert-base-v2’s vocab; otherwise this has to be extracted from a SentencePiece model file, which isn’t fun

Model Splitters: Functions to split the classification head from the model backbone in line with fastai-v2’s new definition of Learner
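To illustrate the splitter idea (this is a toy sketch, not FastHugs' actual implementation): fastai-v2's Learner takes a `splitter` function that returns parameter groups, so discriminative learning rates can be applied to backbone vs. head. A stand-in `nn.Sequential` plays the role of a transformer classifier here:

```python
import torch.nn as nn

# Toy stand-in for a transformer classifier: layers 0-1 play the
# "backbone", the final Linear plays the classification "head".
model = nn.Sequential(
    nn.Embedding(100, 16),
    nn.Linear(16, 16),
    nn.Linear(16, 2),
)

def splitter(m):
    "Return [backbone params, head params] for discriminative LRs."
    backbone = list(m[0].parameters()) + list(m[1].parameters())
    head = list(m[2].parameters())
    return [backbone, head]

groups = splitter(model)
print(len(groups))  # 2 parameter groups
```

Passing such a function as `splitter=` to `Learner` lets `fit_one_cycle` assign a different learning rate to each group.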

This notebook heavily borrows from this notebook, which in turn is based on this tutorial and accompanying article. Huge thanks to Melissa Rajaram and Maximilien Roberti for these great resources; if you’re not familiar with the HuggingFace library, please give them a read first as they are quite comprehensive.


Now added to fastai’s unofficial extension repository :wink:


Thanks! :smiley:

This is great! Note that I added the rules in the factory methods of Tokenizer so the tempTokenizer class should not be needed anymore.


Great news, I’ll take it out!

One small change needed still I think:

Thanks for creating this notebook, I find it quite helpful!
Based on the course, with an AWD_LSTM model you can achieve around 94% accuracy (albeit with a bit more fine tuning) on the IMDB sentiment analysis task. Any idea why we only get 86% accuracy (in the fasthugs_demo.ipynb notebook)? I thought transformers were supposed to be state of the art so I’m a bit surprised.


They don’t work as well for classification in general, from what we found


To check Sylvain’s comment, I tried to modify Morgan’s excellent notebook.

Instead of using a classifier model, I want to import a BERT LM into fastai2 and do some text predictions.

The tokenization works fine, as does the following command. BTW, this is the only Learner that I managed to get working:
learn = LMLearner(dls, fasthugs_model_lm, loss_func=CrossEntropyLossFlat(), splitter=splitter, cbs=cbs)

However, I get a weird error in the loss about mismatched input/target sizes. I am still struggling with the code, but I would appreciate hearing from anybody who has already succeeded.

@morgan this may clean up the class selection section of your code:

This looks great, huggingface are nailing it these days!

Can you share all your code? What task are you trying to do? text classification? text generation? other?

My goal is to use a trained roberta/huggingface language model for masked language prediction / next word prediction in a fastai environment.

I want to import a model trained on a large corpus, use it as-is in fastai2 as a language model, and make predictions using learn.predict.

As a last step, I would like to modify the model head to do classification on another task. Since I am learning fastai, I assume it would be easier to have all the customization done in the fastai environment.

My notebook can be found at


Is it possible to do multi-label text classification with this project?

Hi, I am planning to start a project on text generation. Is it easy to generate text with the many pre-trained models like GPT-2 or wiki-text?
Or is there an example notebook for text generation?

Hey @JonathanSum, @hello34 , @rubensmau I’m looking at fasthugs over the next few days, will hopefully be able to comment back here in a day or two :slight_smile:


How do you deploy this model? Since you have used transformers models, have you tried to deploy it?

The same way you would deploy any other fastai model :slight_smile: I would say learn.export() would be better, as I think that would export the tokenizer for you too.

I have already tried to export the model. I used a custom transformers model to get logits like this:

class CustomTransformerModel(nn.Module):
    def __init__(self, transformer_model: PreTrainedModel):
        super().__init__()
        self.transformer = transformer_model

    def forward(self, input_ids):
        # Return only the logits from the transformer
        logits = self.transformer(input_ids)[0]
        return logits

and I have defined the learner as follows:

loss_func = nn.BCEWithLogitsLoss()
custom_transformer_model = CustomTransformerModel(transformer_model=bert_model)

from fastai.callbacks import *
learner = Learner(databunch, custom_transformer_model, loss_func=loss_func)
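As a side note on the loss choice (which also touches the earlier multi-label question): BCEWithLogitsLoss expects float multi-hot targets with the same shape as the logits, which is what makes it suitable for multi-label classification. A minimal sketch with illustrative shapes:

```python
import torch
import torch.nn as nn

loss_func = nn.BCEWithLogitsLoss()
logits = torch.randn(4, 3)                  # batch of 4, 3 labels each
targets = torch.tensor([[1., 0., 1.]] * 4)  # float multi-hot targets
loss = loss_func(logits, targets)
print(loss.dim())  # 0: a scalar loss
```

Each label is scored independently with a sigmoid, so any number of labels per example can be active at once.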

but when loading the model I get this error:

You need to declare your custom model class before calling it, i.e. in a .py file that you import or in a notebook cell, so it can be referenced.
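The underlying reason is how Python pickling works (which fastai's model loading relies on): a pickle stores a reference to the class, not its code, so the class must be defined or importable wherever you load. A minimal sketch with a hypothetical class name:

```python
import pickle

class CustomModel:
    "Stand-in for a custom model class (hypothetical name)."
    def __init__(self):
        self.ready = True

# Serialising stores only a reference to CustomModel, not its code...
payload = pickle.dumps(CustomModel())

# ...so deserialising only works where CustomModel is defined/importable.
restored = pickle.loads(payload)
print(restored.ready)  # True
```

If the class definition is missing at load time, unpickling raises an AttributeError instead.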
