I’m simply calling `predict_with_targs(learner.model, learner.data.val_dl)`, and my GPU memory quickly spikes from essentially nothing to OOM on my 11GB 1080 Ti.
Has anyone else encountered this? And if so, how did you resolve it?
I’m wondering if I need to install PyTorch from source, as I’m currently using the one installed by running `conda env update`. I was able to run the identical code on my 2GB GTX 970, where I did install PyTorch from source. Thoughts?
I also notice that GPU memory never decreases as I run my notebook: whether the GPU is being utilized or not, the memory in use according to `nvidia-smi` stays the same or climbs.
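My understanding (and I may be wrong here) is that PyTorch’s caching allocator holds on to freed GPU memory for reuse, so `nvidia-smi` reports the cache as well as live tensors. A little helper like this (my own sketch, not fastai code) should show what’s actually allocated, and `empty_cache()` hands the cached blocks back to the driver:

```python
import torch

def gpu_mem_in_use():
    """Bytes of GPU memory held by live tensors (None if no GPU is visible).
    nvidia-smi additionally counts PyTorch's cached-but-free blocks."""
    if not torch.cuda.is_available():
        return None
    torch.cuda.empty_cache()  # return cached blocks to the driver
    return torch.cuda.memory_allocated()
```

So the `nvidia-smi` number staying flat might just be the cache, not a leak.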
I should note that this is happening when testing the ULMFiT classifier. This is where it’s bombing out:
```
~/development/dlproj/fastai/lm_rnn.py in <listcomp>(.0)
    131
    132 def concat(self, arrs):
--> 133     return [torch.cat([l[si] for l in arrs]) for si in range(len(arrs))]
    134
    135 def forward(self, input):
```
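As a possible workaround I’m considering looping over the batches myself, something like the sketch below — `predict_to_cpu` is my own made-up helper, not fastai’s API, and it assumes a PyTorch recent enough to have `torch.no_grad()`:

```python
import torch

def predict_to_cpu(model, dl):
    """Run inference batch by batch, moving each batch's outputs to the
    CPU immediately so activations don't pile up in GPU memory."""
    model.eval()
    preds, targs = [], []
    with torch.no_grad():  # skip autograd bookkeeping during inference
        for x, y in dl:
            out = model(x)
            preds.append(out.cpu())  # drop the GPU copy right away
            targs.append(y.cpu())
    return torch.cat(preds), torch.cat(targs)
```

That way the big `torch.cat` happens on the CPU side, where I have plenty of RAM.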