How to prevent CUDA out of memory errors in Lesson 4?


(Paulo Eduardo Neves) #1

After a lot of problems getting the Lesson 4 sentiment notebook (lesson4-imdb.ipynb) to run on my Windows 10 machine, I'm now stuck on a CUDA out-of-memory error.

When it starts to train the Sentiment classifier, in this cell:

m3.freeze_to(-1)  # train only the last layer group first
m3.fit(lrs/2, 1, metrics=[accuracy])
m3.unfreeze()     # then fine-tune all layers
m3.fit(lrs, 1, metrics=[accuracy], cycle_len=1)

it always fails with this message (complete stack trace at the end of this post):

RuntimeError: cuda runtime error (2) : out of memory at c:\anaconda2\conda-bld\pytorch_1519501749874\work\torch\lib\thc\generic/THCStorage.cu:58

I have an NVIDIA GTX 1060 with 6 GB of memory. Usually reducing the batch size lets me run the models, but this time it wasn't enough. I changed from the commented-out values to the ones below, but I still get the error:

# bs = 64; bptt = 70
bs = 32; bptt = 30
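For reference, here is a quick way to watch GPU memory from inside the notebook while experimenting with bs and bptt (a minimal sketch, assuming nvidia-smi is on the PATH; this is not part of the lesson code):

import subprocess

# Query current GPU memory usage via nvidia-smi (assumed to be on the PATH)
print(subprocess.check_output(
    ['nvidia-smi', '--query-gpu=memory.used,memory.total', '--format=csv'],
).decode())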

I'm also keeping these parameters set:

em_sz = 200  # size of each embedding vector
nh = 500     # number of hidden activations per layer
nl = 3       # number of layers
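For anyone sizing this, here is a back-of-envelope estimate of the weight count these settings imply (a rough sketch only: the vocabulary size is hypothetical, and the real fastai model ties weights and adds dropout, so this is not exact):

vocab = 60000  # hypothetical vocabulary size, just for the estimate
em_sz, nh, nl = 200, 500, 3

embedding = vocab * em_sz  # embedding matrix
lstm, inp = 0, em_sz
for _ in range(nl):
    # each LSTM layer holds 4 gates of (input + hidden) x hidden weights plus biases
    lstm += 4 * ((inp + nh) * nh + nh)
    inp = nh
print(f'~{(embedding + lstm) / 1e6:.1f}M weights, before any activations')

This comes out around 17M weights, roughly 70 MB in fp32, so the weights themselves are small; it's the activations, which scale with bs × bptt, that usually blow the 6 GB budget.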

If I change these parameters, the learner.fit call in the Train section also fails with a CUDA out-of-memory error. It looks like a lot of data is being cached in the notebook. If I change them and try to run just the Sentiment section, I also get an out-of-memory error.
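If the problem really is state cached from earlier sections, one thing worth trying before a full kernel restart is explicitly dropping the earlier objects and releasing PyTorch's cached memory (a sketch; torch.cuda.empty_cache() exists in newer PyTorch releases, so on an older install restarting the kernel may be the only reliable option):

import gc
import torch

del learner  # or whichever model objects from the Train section are still alive
gc.collect()              # drop the Python references first
torch.cuda.empty_cache()  # then release PyTorch's cached GPU blocks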

Please, could someone give me some guidance on how I should change my parameters so that I can run it on my 6 GB GPU?

Oops, I've just lost my stack trace. I'll run it again and post it here tomorrow.


(Chris Palmer) #2

Hi @neves - I am interested in whether you worked through this. I have an even weaker GPU than the 1060 (a 650 Ti with 2 GB) and was wondering whether I could "upgrade" my limited-capacity system to a 1060 or 1070. So finding that the NLP notebook produces out-of-memory issues on a 6 GB card is interesting to me.