Change the default language model to GPT-2 or another transformer-based LM?

I'm going through the NLP part of the new part 1 course and book, and wondering whether it would be useful to make transformer models the default, or at least easier to use out of the box with fastai. From the little I have seen, they seem faster to fine-tune. I think I read somewhere that they are not as good at classification as RNNs or LSTMs, but I really don't know.

Are there any drawbacks to using transformers instead of RNNs or LSTMs?

You can use transformers with fastai.

I tried running that just now on Google Colab and got an out-of-memory error. It happens at learn.lr_find, and also at learn.fit_one_cycle if I skip lr_find.

RuntimeError: CUDA out of memory. Tried to allocate 148.00 MiB (GPU 0; 15.75 GiB total capacity; 14.31 GiB already allocated; 28.88 MiB free; 14.39 GiB reserved in total by PyTorch)

I guess high memory usage is one problem with these large language models? Maybe DistilBERT or something similar would work better.
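To put rough numbers on the memory question, here is a back-of-the-envelope sketch. It assumes full fine-tuning in fp32 with an Adam-style optimizer, so each parameter costs about four 4-byte copies (weights, gradients, two optimizer moments). The parameter counts are approximate published figures that I am using only for illustration, and the function name is made up for this example. Activations are not counted, and in practice they can easily be the larger share depending on batch size and sequence length.

```python
# Rough estimate of GPU memory held by weights, gradients, and Adam state
# during full fp32 fine-tuning. Activations are NOT included; they grow with
# batch size and sequence length and often dominate.

BYTES_PER_PARAM_FP32 = 4
# Assumption: weights + gradients + Adam first and second moments = 4 copies.
COPIES_PER_PARAM = 4

# Approximate parameter counts, used here only for illustration.
APPROX_PARAMS = {
    "distilbert-base": 66_000_000,
    "bert-base": 110_000_000,
    "gpt2-small": 124_000_000,
    "gpt2-xl": 1_500_000_000,
}


def training_state_gib(n_params: int) -> float:
    """Return GiB needed for weights, gradients, and Adam state in fp32."""
    return n_params * BYTES_PER_PARAM_FP32 * COPIES_PER_PARAM / 2**30


if __name__ == "__main__":
    for name, n_params in APPROX_PARAMS.items():
        gib = training_state_gib(n_params)
        print(f"{name:16s} ~{gib:5.2f} GiB before activations")
```

Under these assumptions the smaller models leave most of a 15.75 GiB card free for activations, so if one of them still runs out of memory, lowering the batch size or the maximum sequence length is probably the first thing to try. The largest GPT-2 variant would not fit on that card for full fine-tuning even before activations are counted.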