RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1512387374934/work/torch/lib/THC/generic/THCStorage.cu:58
when training the model with `learner.fit(3e-3, 4, wds=1e-6, cycle_len=1, cycle_mult=2)`.
However, the graphics card is driving the X display as well as training the language model, so some of the memory (approx 350-450 MiB) is already taken up with that. Having tracked memory usage through nvidia-smi, there is not much headroom on the card's 8 GB.
I think part of the problem might also be that, as Jeremy discusses in the video, the bptt = 70 parameter is not 100% fixed: the sequence length is randomised per batch, so the size of each batch tensor (and hence peak memory) can vary somewhat.
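As a rough sketch of why this matters for memory: as I understand it from the lesson, the loader mostly uses the full bptt, occasionally halves it, and then jitters the result with Gaussian noise (the exact constants here are my assumption, not copied from the fastai source). Simulating that shows some batches come out *longer* than the nominal bptt, which is where the occasional memory spike would come from:

```python
import numpy as np

def sample_seq_len(bptt=70, seed=None):
    # Hedged sketch of the per-batch sequence-length jitter:
    # ~95% of the time start from the full bptt, otherwise half of it,
    # then draw from a normal around that base (std of 5 assumed),
    # clamped to a minimum length of 5.
    rng = np.random.RandomState(seed)
    base = bptt if rng.random_sample() < 0.95 else bptt / 2.0
    return max(5, int(rng.normal(base, 5)))

# Sample many batch lengths: a fair fraction exceed the nominal bptt of 70,
# so the biggest batches need noticeably more memory than the average one.
lengths = [sample_seq_len(70, seed=s) for s in range(10000)]
print(min(lengths), max(lengths))
```

If that's right, OOM depends on the *largest* batch you ever draw, not the typical one, which would explain why a run can train fine for a while and then die.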
I’ll try using a bptt of 65 and see if that improves things …