GPU Memory Error With Transformers

I’ve been implementing a language model with Hugging Face’s transformers library, following the tutorial in the fastai2 docs.

When I try to build the classification model I get the infamous `RuntimeError: CUDA out of memory`. I tried lowering the batch size to 2, but that still gave the same error. I also tried emptying the cache, but that didn’t work either.
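For reference, this is roughly what I tried. It's just a sketch: the `dblock` and `df` names are placeholders for whatever DataBlock and DataFrame you built while following the tutorial:

```python
import gc
import torch

# Drop any tensors that are no longer referenced, then release
# PyTorch's cached blocks back to the driver. This only helps if
# something else is actually holding on to the memory.
gc.collect()
torch.cuda.empty_cache()

# In fastai the batch size is set when the DataLoaders are built,
# e.g. (placeholder DataBlock `dblock` and DataFrame `df`):
# dls = dblock.dataloaders(df, bs=2)
```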

Since I’m using a pre-trained model from another library, I don’t think I can make the model itself any smaller. I’m using Google Colab, if that matters.

Edit: I switched from BERT to DistilBERT with `batch_size = 4` and it worked. If you hit this error, consider trying a smaller model.
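For anyone else running into this, the swap is roughly the sketch below, using the transformers Auto classes. The model names and `num_labels=2` are placeholders for your own setup:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# DistilBERT has roughly 40% fewer parameters than bert-base, so the
# model weights and activations take noticeably less GPU memory.
model_name = "distilbert-base-uncased"  # was "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
```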


Did you try a batch size of 1? How much RAM does your GPU have? GPT-2 is a pretty big model…

Google Colab provides K80 GPUs, which come with about 12 GB of RAM (I think). I was actually using BERT, not GPT-2. I moved from regular BERT to DistilBERT, which seems to work fine, and I could also use bs=4.
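If you want to confirm what Colab actually gave you, something like this (plain PyTorch, no fastai needed) prints the card and its total memory; running `!nvidia-smi` in a Colab cell works too:

```python
import torch

# Print the GPU Colab assigned and its total memory in GB.
props = torch.cuda.get_device_properties(0)
print(props.name, f"{props.total_memory / 1024**3:.1f} GB")
```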