Lesson 1 Sentiment Analysis: CUDA Memory Error despite Batch Size of 16

I’m running the Movie Sentiment Analysis code snippet from lesson 1, but I’ve hit the same CUDA out of memory error multiple times, despite having 6GB VRAM in my GTX 1060 (laptop version) and a batch size of 16. I’m not sure if there is something else I can do to further optimize the memory usage (either reducing the VRAM used or increasing allocated capacity, though the latter is unlikely, as it already seems to be getting close to the limit). The error is here:

RuntimeError: CUDA out of memory. Tried to allocate 92.00 MiB (GPU 0; 5.94 GiB total capacity; 4.67 GiB already allocated; 49.94 MiB free; 5.13 GiB reserved in total by PyTorch)

I'm debating spinning up a cloud server, since the GPU model (not the manufacture date) is around 5 years old, which might be too outdated for the tech we're using here.

GPU memory usage depends not only on your batch size but also on the size and type of your model and on the size of the input.
To make it work on your PC, you could:

  1. Further reduce the batch size.
  2. Use a smaller model (like DistilBERT instead of BERT).
  3. Shorten the sequence length. Especially for transformers, the memory needed for self-attention scales quadratically with the sequence length.
  4. Use mixed precision, if supported by your hardware (it can reduce GPU memory usage by nearly 50%).
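
To make points 1, 3 and 4 concrete, here is a rough back-of-the-envelope sketch in plain Python. It only counts the attention score matrices of a single transformer layer, and the 12 attention heads are an assumption (a BERT-base-like setup), not something taken from the lesson notebook. The function name `attention_scores_bytes` is made up for this illustration. Real usage will be considerably higher because of weights, other activations, gradients and optimizer state.

```python
# Rough estimate of the memory taken by the attention score matrices
# of ONE transformer layer. Hypothetical helper, for illustration only.
def attention_scores_bytes(batch_size, num_heads, seq_len, bytes_per_value=4):
    # One (seq_len x seq_len) score matrix per head and per sample.
    return batch_size * num_heads * seq_len * seq_len * bytes_per_value

MIB = 1024 ** 2

# Assumed setup: batch size 16 (as in the question), 12 heads (assumption).
for seq_len in (128, 256, 512):
    fp32 = attention_scores_bytes(16, 12, seq_len) / MIB
    fp16 = attention_scores_bytes(16, 12, seq_len, bytes_per_value=2) / MIB
    print(f"seq_len={seq_len}: {fp32:.0f} MiB in fp32, {fp16:.0f} MiB in fp16")
```

Doubling the sequence length quadruples this term, halving the batch size halves it, and storing values in 16 bits instead of 32 halves it again, which is why shortening the inputs is often the most effective of the options above.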

That mixed-precision solution sounds interesting. How would I check if my hardware supports it?

After you build your learner, try: learner = learner.to_fp16(). Then train as usual. I think it will throw an error if mixed precision is not supported.
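
Another way to get a hint before training is to look at the GPU's CUDA compute capability. The sketch below is only a rule of thumb: the helper name `has_tensor_cores` is hypothetical, and it relies on the fact that NVIDIA introduced Tensor Cores with compute capability 7.0 (Volta). A GTX 1060 is a Pascal card with compute capability 6.1, so half precision may still save memory there, but a speed-up should not be expected.

```python
# Hypothetical helper: guess whether a GPU has Tensor Cores from its
# CUDA compute capability, given as a (major, minor) tuple.
def has_tensor_cores(capability):
    major, _minor = capability
    # Tensor Cores first appeared with compute capability 7.0 (Volta).
    return major >= 7

# With PyTorch installed and a CUDA device present, the tuple could be
# obtained with: torch.cuda.get_device_capability(0)
gtx_1060 = (6, 1)  # Pascal
print("GTX 1060 has Tensor Cores:", has_tensor_cores(gtx_1060))
```

This does not replace simply trying mixed precision on the learner and watching the memory usage. It only tells you whether the hardware has dedicated units for fast half-precision math.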


Got it, will try, thanks!