Clearing GPU Memory - PyTorch

I am trying to run the first lesson locally on a machine with a GeForce GTX 760, which has 2GB of memory.

After executing this block of code:

arch = resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)

The GPU memory jumped from 350MB to 700MB. Going on with the tutorial and executing more blocks of code that included a training operation pushed memory consumption higher until it reached the 2GB maximum, after which I got a runtime error indicating that there isn’t enough memory.

I know for this particular case, this can be avoided by skipping the previous blocks of code which had a training operation in them and just executing the one where I ran out of memory, but how else could this be solved? I tried executing del learn but that doesn’t seem to free any memory.


Try to restart the Jupyter kernel.

Or, we can free this memory without needing to restart the kernel. See the following thread for more info.


If you ran del some_object, follow it up with torch.cuda.empty_cache().

This releases the unused memory that PyTorch keeps cached for reuse after a del some_object, so you can see how much memory is truly available.
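The sequence described above can be sketched roughly as follows. This is a minimal illustration, not code from the lesson: some_object is a hypothetical stand-in tensor, report_gpu_memory is a helper made up for this example, and the gc.collect() call is an extra step that is often added in case reference cycles are still holding tensors. The sketch falls back to the CPU when no CUDA device is present, in which case the reported numbers are simply zero.

```python
import gc

import torch


def report_gpu_memory(tag):
    """Print and return (allocated, reserved) bytes on the current CUDA device."""
    if not torch.cuda.is_available():
        print(f"{tag}: no CUDA device available")
        return 0, 0
    allocated = torch.cuda.memory_allocated()  # bytes held by live tensors
    reserved = torch.cuda.memory_reserved()    # bytes held by PyTorch's caching allocator
    print(f"{tag}: allocated={allocated / 2**20:.1f} MiB, "
          f"reserved={reserved / 2**20:.1f} MiB")
    return allocated, reserved


device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical stand-in for a large object such as a learner or a batch.
some_object = torch.zeros(1024, 1024, device=device)
report_gpu_memory("after allocation")

del some_object   # drop the last Python reference to the tensor
gc.collect()      # collect reference cycles that may still hold tensors
report_gpu_memory("after del")

if torch.cuda.is_available():
    torch.cuda.empty_cache()  # hand unused cached blocks back to the driver
after = report_gpu_memory("after empty_cache")
```

On a GPU, the "reserved" figure is the one that typically stays high after del and drops only after empty_cache(), which is also the figure tools like nvidia-smi tend to reflect.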


Thanks @sam2! torch.cuda.empty_cache() worked for me.


For me learn.destroy() worked.
Also, I had the “CUDA out of memory. Tried to allocate 18.00 MiB (GPU 0; 11.00 GiB total capacity; 8.63 GiB already allocated; 14.32 MiB free; 97.56 MiB cached)” issue. I got it to work with Jeremy’s bs (lesson3-camvid/2019) by adding .to_fp16() on the learner. Most probably fragmentation related…
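As a rough illustration of why half precision can relieve memory pressure, the sketch below compares the storage of the same tensor in float32 and float16 using plain PyTorch. It is not what .to_fp16() does internally, only the per-element arithmetic behind it; the tensor shape and the size_mib helper are made up for the example.

```python
import torch


def size_mib(t):
    """Storage taken by the tensor's elements, in MiB."""
    return t.element_size() * t.nelement() / 2**20


fp32 = torch.zeros(1024, 1024, dtype=torch.float32)  # 4 bytes per element
fp16 = fp32.half()                                   # 2 bytes per element

print(f"float32: {size_mib(fp32):.1f} MiB")  # prints "float32: 4.0 MiB"
print(f"float16: {size_mib(fp16):.1f} MiB")  # prints "float16: 2.0 MiB"
```

Actual savings during training will likely be smaller than half, since mixed-precision setups usually keep some values in float32.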


Hi Octav, could you please show the line of code where you added .to_fp16()?
Also, could you explain when you would use learn.destroy()? After saving the model?

I used .to_fp16() when creating the learner:

learn = unet_learner(data, models.resnet34, ...).to_fp16()

Regarding the .destroy action, I did not save the model since I was not going to use it (given that it did not fit in my GPU’s memory). I used learn.destroy() and then checked gpu_mem_get_free_no_cache(). Nowadays I am just restarting the whole kernel instead…

Thanks! How do you then deal with the tensor/input mismatch error “RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same”?

Most probably you also convert the input with to_fp16(). I do not have that code with me right now, but this should be it…
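In plain PyTorch terms, the mismatch goes away once the input has the same dtype and device as the model weights. The sketch below is an assumption-laden illustration rather than the fastai code in question: match_model is a hypothetical helper, the Conv2d layer is a stand-in for a real network, and on machines without CUDA it falls back to double precision only so that the example still runs.

```python
import torch
from torch import nn


def match_model(x, model):
    """Cast x to the device and dtype of the model's first parameter."""
    p = next(model.parameters())
    return x.to(device=p.device, dtype=p.dtype)


model = nn.Conv2d(3, 8, kernel_size=3, padding=1)  # stand-in for a real network
if torch.cuda.is_available():
    model = model.cuda().half()  # weights become torch.cuda.HalfTensor
else:
    model = model.double()       # CPU fallback, only so the sketch runs without a GPU

# A float32 input passed directly would not match the weights' dtype.
x = torch.randn(2, 3, 16, 16)
out = model(match_model(x, model))
print(out.dtype, tuple(out.shape))
```

If the learner was created with .to_fp16(), it is worth checking first whether the prediction path you are calling bypasses the learner, since that is one plausible way for a float32 input to reach half-precision weights.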

I created an account just to like your post. Thanks!