CUDA Out of Memory Error

dhiman63 · February 14, 2018, 10:46am

Hi All,

I have just started setting up my MacBook Pro for the fast.ai course. It took almost 2 days! No complaints, though, as I learned a lot along the way.

But as soon as I started executing the snippets from lesson1.ipynb, I got the following error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-18-e6c87b20ce86> in <module>()
      1 arch=resnet34
      2 data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
----> 3 learn = ConvLearner.pretrained(arch, data, precompute=True)
      4 learn.fit(0.01, 2)

~/Anaconda/anaconda3/lib/python3.6/site-packages/fastai/conv_learner.py in pretrained(cls, f, data, ps, xtra_fc, xtra_cut, **kwargs)
     92     @classmethod
     93     def pretrained(cls, f, data, ps=None, xtra_fc=None, xtra_cut=0, **kwargs):
---> 94         models = ConvnetBuilder(f, data.c, data.is_multi, data.is_reg, ps=ps, xtra_fc=xtra_fc, xtra_cut=xtra_cut)
     95         return cls(data, models, **kwargs)
     96 

~/Anaconda/anaconda3/lib/python3.6/site-packages/fastai/conv_learner.py in __init__(self, f, c, is_multi, is_reg, ps, xtra_fc, xtra_cut)
     47         self.n_fc = len(fc_layers)
     48         self.fc_model = to_gpu(nn.Sequential(*fc_layers))
---> 49         apply_init(self.fc_model, kaiming_normal)
     50         self.model = to_gpu(nn.Sequential(*(layers+fc_layers)))
     51 

~/Anaconda/anaconda3/lib/python3.6/site-packages/fastai/initializers.py in apply_init(m, init_fn)
      8 
      9 def apply_init(m, init_fn):
---> 10     m.apply(lambda x: cond_init(x, init_fn))
     11 
     12 

~/Anaconda/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in apply(self, fn)
    196         """
    197         for module in self.children():
--> 198             module.apply(fn)
    199         fn(self)
    200         return self

~/Anaconda/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in apply(self, fn)
    197         for module in self.children():
    198             module.apply(fn)
--> 199         fn(self)
    200         return self
    201 

~/Anaconda/anaconda3/lib/python3.6/site-packages/fastai/initializers.py in <lambda>(x)
      8 
      9 def apply_init(m, init_fn):
---> 10     m.apply(lambda x: cond_init(x, init_fn))
     11 
     12 

~/Anaconda/anaconda3/lib/python3.6/site-packages/fastai/initializers.py in cond_init(m, init_fn)
      5     if not isinstance(m, (nn.BatchNorm1d,nn.BatchNorm2d,nn.BatchNorm3d)):
      6         if hasattr(m, 'weight'): init_fn(m.weight)
----> 7         if hasattr(m, 'bias'): m.bias.data.fill_(0.)
      8 
      9 def apply_init(m, init_fn):

RuntimeError: cuda runtime error (2) : out of memory at /Users/dhiman63/pytorch/aten/src/THC/generic/THCTensorMath.cu:15

My Macbook Pro has 2 GB of Graphics Memory (NVIDIA GeForce GT 750M).

Any help is appreciated!

Thanks,
Dhiman

Matthew · February 14, 2018, 11:00am

2 GB is fairly limited. Try restarting your kernel and using a smaller batch size (a somewhat more advanced topic). You can do this by replacing:
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
with:
data = ImageClassifierData.from_paths(PATH, bs=32, tfms=tfms_from_model(arch, sz))
The default batch size (“bs”) is 64. If that doesn’t work, keep lowering the batch size.
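The "keep lowering the batch size" step can be sketched as a small retry loop. This is only an illustration, not part of fastai: `run_with_smaller_batches` and `fake_run` are made-up names, and `fake_run` stands in for the real data/learner code so the sketch runs without a GPU.

```python
def run_with_smaller_batches(run, start_bs=64, min_bs=1):
    """Call run(bs); halve bs after every out-of-memory RuntimeError.

    `run` is a hypothetical callable that builds the data object with the
    given batch size and trains, e.g. a function wrapping the
    ImageClassifierData.from_paths(..., bs=bs, ...) and learn.fit(...) calls.
    Returns (bs_that_worked, result_of_run).
    """
    bs = start_bs
    while bs >= min_bs:
        try:
            return bs, run(bs)
        except RuntimeError as err:
            # Only swallow out-of-memory errors; re-raise anything else.
            if "out of memory" not in str(err):
                raise
            bs //= 2
    raise RuntimeError("out of memory even at bs=%d" % min_bs)


# Stand-in for real training (assumption for the demo):
# pretend that anything above bs=8 does not fit on the GPU.
def fake_run(bs):
    if bs > 8:
        raise RuntimeError("cuda runtime error (2) : out of memory")
    return "trained"


print(run_with_smaller_batches(fake_run))  # (8, 'trained')
```

In a real notebook, a failed attempt may leave GPU memory allocated, so restarting the kernel before retrying with a smaller bs, as suggested above, is probably the safer route.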

dhiman63 · February 14, 2018, 11:47am

Hi Matthew,

Thanks a lot for the suggestion! I tried it out, but instead of bs=32 I went with bs=8 since my GPU is one of the older models. It took more than 10 minutes to run, but it ran to completion!

However, I got the following messages during execution (somewhat expected). It would be great if you could shed some light on them.

  0%|          | 0/2875 [00:00<?, ?it/s]
/Users/dhiman63/Anaconda/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py:116: UserWarning: 
    Found GPU0 GeForce GT 750M which is of cuda capability 3.0.
    PyTorch no longer supports this GPU because it is too old.
    
  warnings.warn(old_gpu_warn % (d, name, major, capability[1]))
/Users/dhiman63/Anaconda/anaconda3/lib/python3.6/site-packages/torch/_utils.py:86: UserWarning: 'async' is deprecated; use 'non_blocking'
  warnings.warn("'async' is deprecated; use 'non_blocking'")
/Users/dhiman63/Anaconda/anaconda3/lib/python3.6/site-packages/fastai/core.py:21: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  def VV_(x): return to_gpu(x, async=True) if isinstance(x, Variable) else Variable(to_gpu(x, async=True), volatile=True)
100%|██████████| 2875/2875 [15:53<00:00,  3.02it/s]
100%|██████████| 250/250 [01:20<00:00,  3.09it/s]
Epoch
100% 2/2 [01:02<00:00, 31.37s/it]
[0.      0.18559 0.03029 0.9875 ]                               
[1.      0.13927 0.03034 0.9885 ]

Thanks once again!

ecdrid · February 14, 2018, 12:12pm

Nice…
You can ignore those warnings…

shep2010 · August 20, 2018, 2:47am

Can you use “bs=” or something similar with “learn.fit(lr, 3, cycle_len=1, cycle_mult=2)”? I am getting the same error in learn.fit, but not earlier. This is also on a home-built GPU machine with 2 GB of GPU memory. I know the Paperspace GPUs we are using have 8 GB, so perhaps I am at my limit?

RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generic/THCStorage.cu:58

Shep

raaj · September 9, 2018, 5:31pm

@shep2010 I am having the same problem. How did you solve it?

shep2010 · September 9, 2018, 7:11pm

Never solved it. I moved back to Paperspace.

TheShadow29 · September 9, 2018, 9:46pm

A CUDA out-of-memory error occurs when the model, together with the data for the current batch, needs more memory than the GPU has. Big networks like resnet may not fit into 2 GB of memory at the default batch size. The bs= option is not an argument to learn.fit; it is set when the data loader is created. In the above case, that is when creating data= using ImageClassifierData.

The best option is Google Colab if you need access to a free GPU.
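Lowering bs helps because the memory needed per batch grows linearly with the batch size. A rough sketch of that arithmetic for the input tensor alone, assuming float32 RGB images and sz=224 (the value of sz is not given in this thread, so 224 is an assumption; `input_batch_mib` is a made-up helper name). The intermediate activations inside the network take far more memory than the input, but they scale with bs in the same way.

```python
def input_batch_mib(bs, sz, channels=3, bytes_per_value=4):
    """Size in MiB of one input batch of shape (bs, channels, sz, sz)."""
    return bs * channels * sz * sz * bytes_per_value / 2**20


for bs in (64, 32, 8):
    # Halving bs halves the memory needed for the batch.
    print(bs, input_batch_mib(bs, 224))
```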

jburkhart · November 28, 2018, 6:45pm

I tried using a 2 GB NVIDIA card for lesson 1. I got most of the notebook to run by adjusting the batch size, clearing the CUDA cache, and other memory management. From reading other forums, it seems GPU memory management is a pretty big challenge with PyTorch. I decided my time is better spent using a GPU with more memory. I have little doubt I will be revisiting GPU memory management as I work with more complex models and larger data sets.
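For reference, "clearing the CUDA cache" usually means something like the sketch below. This is a guess at the kind of step meant here, not the exact code used; `free_gpu_memory` is a made-up helper name. It assumes a PyTorch version that provides torch.cuda.empty_cache(), and it only releases memory that no live Python object still refers to, so objects such as learn or data would need to be deleted (del) first.

```python
import gc


def free_gpu_memory():
    """Collect garbage, then release PyTorch's cached GPU memory.

    Returns True if the CUDA cache was cleared, False if PyTorch or a
    CUDA device is not available.
    """
    # Drop unreachable Python objects that may still hold GPU tensors.
    gc.collect()
    try:
        import torch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    # Hand cached, currently unused blocks back to the GPU driver.
    torch.cuda.empty_cache()
    return True


print(free_gpu_memory())
```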

arijitdeb · April 23, 2019, 1:46pm

Hi,
I am getting the error below when executing learn.fit(lr, 3, cycle_len=1) in the deep learning lesson lesson1-rxt50. My setup has a ‘GeForce GTX 1070 Ti’ and 32GB of RAM. The PyTorch version is 0.3.1.post2. Please suggest a fix.

RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generic/THCStorage.cu:58

Thanks,
Arijit

arijitdeb · April 23, 2019, 2:41pm

RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generic/THCStorage.cu:58

Memory usage also climbs at this stage, reaching close to 8GB according to nvidia-smi.
Can you please suggest how to tune this?

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                                  Usage |
|=============================================================================|
|    0       936      G   /usr/lib/xorg/Xorg                           266MiB |
|    0      2137      G   budgie-wm                                    126MiB |
|    0      2942      G   …gpu-preferences=KAAAAAAAAACAAAAAAQAAAAA      70MiB |
|    0      5150      C   /usr/lib/libreoffice/program/soffice.bin     105MiB |
|    0      7232      C   …e/user/anaconda3/envs/fastai/bin/python    7519MiB |
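
To see where the memory in a table like this goes, the per-process rows can be summed. A minimal sketch, assuming rows shaped like the ones quoted above; `usage_by_process` is a made-up helper name, and the rows are copied from the post with the truncated process names left as they are.

```python
import re

# Process rows copied from the nvidia-smi output quoted above.
NVIDIA_SMI_ROWS = """
| 0 936 G /usr/lib/xorg/Xorg 266MiB |
| 0 2137 G budgie-wm 126MiB |
| 0 2942 G …gpu-preferences=KAAAAAAAAACAAAAAAQAAAAA 70MiB |
| 0 5150 C /usr/lib/libreoffice/program/soffice.bin 105MiB |
| 0 7232 C …e/user/anaconda3/envs/fastai/bin/python 7519MiB |
"""

# GPU index, PID, type, process name, memory usage in MiB.
ROW = re.compile(r"(\d+)\s+(\d+)\s+([A-Z+]+)\s+(\S+)\s+(\d+)MiB")


def usage_by_process(text):
    """Return a list of (pid, process_name, mib) tuples, one per table row."""
    rows = []
    for line in text.splitlines():
        match = ROW.search(line)
        if match:
            rows.append((int(match.group(2)), match.group(4), int(match.group(5))))
    return rows


rows = usage_by_process(NVIDIA_SMI_ROWS)
total_mib = sum(mib for _, _, mib in rows)
biggest = max(rows, key=lambda row: row[2])

print(total_mib)               # 8086
print(biggest[0], biggest[2])  # 7232 7519
```

The python process (PID 7232) holds 7519 of the 8086 MiB listed, and the desktop processes account for the remaining 567 MiB. That matches the "close to 8GB" mentioned in the post, so a smaller bs when creating the data object, as suggested earlier in the thread, is probably the first thing to try.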