Going through the NLP part of the new part 1 course and book. Wondering if it would be useful to have transformer models be the default, or at least easier to use out of the box with fastai. From the little I have seen, it seems like they would be faster to fine-tune. I think I read somewhere that they are not as good at classification as RNNs or LSTMs, but I really don't know.
Any drawbacks to using transformers instead of RNNs or LSTMs?
I did try running that just now and got an out of memory error. It happens at learn.lr_find, or if I skip that, at learn.fit_one_cycle. This is on Google Colab.
RuntimeError: CUDA out of memory. Tried to allocate 148.00 MiB (GPU 0; 15.75 GiB total capacity; 14.31 GiB already allocated; 28.88 MiB free; 14.39 GiB reserved in total by PyTorch)
I guess high memory usage would be one problem with these large language models? Maybe DistilBERT or something similar would work better.
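Besides switching to a smaller model like DistilBERT, a common workaround for CUDA OOM is to shrink the batch size and accumulate gradients over several micro-batches, so the effective batch size stays the same while only one micro-batch lives on the GPU at a time (fastai ships this as the GradientAccumulation callback). Here is a minimal plain-Python sketch of the idea; the toy grad function is hypothetical, just to make the arithmetic concrete:

```python
# Sketch: accumulating gradients over micro-batches reproduces the
# full-batch gradient, while only one micro-batch needs to be in
# memory at a time. The per-example gradient below is a made-up toy
# (squared-error style), not any real model's gradient.

def grad(example, w):
    # Hypothetical per-example gradient for illustration only.
    return 2 * (w - example)

def full_batch_grad(batch, w):
    # What one big batch would compute (may not fit in GPU memory).
    return sum(grad(x, w) for x in batch) / len(batch)

def accumulated_grad(batch, w, micro_bs):
    # Same result, computed micro-batch by micro-batch: accumulate,
    # and only take the optimizer step after the whole batch is seen.
    total = 0.0
    for i in range(0, len(batch), micro_bs):
        micro = batch[i:i + micro_bs]
        total += sum(grad(x, w) for x in micro)
    return total / len(batch)

batch = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
w = 1.25
print(abs(full_batch_grad(batch, w) - accumulated_grad(batch, w, 2)) < 1e-9)
```

In fastai that would look roughly like `Learner(..., cbs=GradientAccumulation(n_acc))` with a smaller bs in the DataLoaders, though the exact numbers you can fit will depend on the model and the Colab GPU you get.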