Clearing GPU Memory - PyTorch

(Ambivalent Torch) #1

I am trying to run the first lesson locally on a machine with a GeForce GTX 760, which has 2GB of memory.

After executing this block of code:

arch = resnet34
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)

GPU memory usage jumped from 350MB to 700MB. Continuing with the tutorial and executing more blocks of code that contained a training operation pushed the usage higher until it reached the 2GB maximum, at which point I got a runtime error indicating that there wasn’t enough memory.

I know that in this particular case I can avoid the problem by skipping the earlier blocks that contained a training operation and executing only the one where I ran out of memory, but how else could this be solved? I tried executing del learn, but that doesn’t seem to free any memory.


(Cedric Chee) #2

Try to restart the Jupyter kernel.

Or, we can free this memory without needing to restart the kernel. See the following thread for more info.


(Sam) #3

If you ran del some_object,

follow it up with torch.cuda.empty_cache()

This releases the memory that PyTorch is holding for reuse. (You may have read that PyTorch keeps memory cached for reuse after a del some_object.)

This way you can see how much memory is truly available.
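To make the difference between "allocated" and "cached" memory visible, here is a minimal sketch in plain PyTorch. The helper names (cuda_ready, gpu_memory_mib, release_cached_memory) are made up for illustration; only gc.collect(), torch.cuda.is_available(), torch.cuda.memory_allocated(), torch.cuda.memory_reserved() and torch.cuda.empty_cache() are real calls. It assumes a reasonably recent PyTorch (memory_reserved was called memory_cached in older releases) and falls back to zeros when no GPU is present.

```python
import gc

try:
    import torch
except ImportError:  # assumption: lets the sketch load on a machine without PyTorch
    torch = None


def cuda_ready():
    """True only when PyTorch is installed and a CUDA device is visible."""
    return bool(torch is not None and torch.cuda.is_available())


def gpu_memory_mib():
    """Return (allocated, reserved) in MiB for the current device, or (0.0, 0.0) without CUDA."""
    if not cuda_ready():
        return (0.0, 0.0)
    mib = 1024 ** 2
    return (torch.cuda.memory_allocated() / mib, torch.cuda.memory_reserved() / mib)


def release_cached_memory():
    """Collect unreachable Python objects, then hand cached blocks back to the driver."""
    gc.collect()
    if cuda_ready():
        torch.cuda.empty_cache()
    return gpu_memory_mib()


if cuda_ready():
    x = torch.empty(16, 1024, 1024, device="cuda")  # 64 MiB of float32
    print("after alloc      :", gpu_memory_mib())
    del x  # the tensor is gone, but PyTorch typically keeps the block cached
    print("after del        :", gpu_memory_mib())
    print("after empty_cache:", release_cached_memory())
```

Note that empty_cache() can only release memory that nothing references any more, so a del learn has no visible effect if another variable (or a notebook output such as _) still points at the model or its tensors. Tools like nvidia-smi also count the CUDA context itself, so they will report more than the reserved figure printed here.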


(Fernando Marcos Wittmann) #4

Thanks @sam2! torch.cuda.empty_cache() worked for me :)


(Octav Cristian Florescu) #5

For me, learn.destroy() worked.
I also hit the error “CUDA out of memory. Tried to allocate 18.00 MiB (GPU 0; 11.00 GiB total capacity; 8.63 GiB already allocated; 14.32 MiB free; 97.56 MiB cached)”. I got it to work with Jeremy’s bs (lesson3-camvid/2019) by adding .to_fp16() to the learner. Most probably fragmentation related…


(Igor D) #6

Hi Octav, could you please show the line of code where you added .to_fp16()?
Also, could you explain when you would use learn.destroy()? After saving the model?


(Octav Cristian Florescu) #7

I used .to_fp16() when creating the learner:

learn = unet_learner(data, models.resnet34, ...).to_fp16()
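As a rough illustration of why half precision helps with memory, the sketch below compares the storage of the same tensor in fp32 and fp16 using plain PyTorch on the CPU. The helper names tensor_mib and compare_precisions and the batch shape are assumptions chosen for the example; they are not part of fastai. As far as I understand, fastai's .to_fp16() is mixed precision and keeps some fp32 state, so the real saving on a learner is usually less than the clean factor of two shown here.

```python
def tensor_mib(t):
    """Storage used by a tensor's elements, in MiB."""
    return t.element_size() * t.nelement() / 1024 ** 2


def compare_precisions(shape=(8, 3, 256, 256)):
    """Return (fp32_mib, fp16_mib) for a tensor of the given shape; no GPU needed."""
    import torch  # imported here so tensor_mib itself has no dependencies

    full = torch.zeros(shape, dtype=torch.float32)  # 4 bytes per element
    half = full.half()                              # 2 bytes per element
    return tensor_mib(full), tensor_mib(half)
```

Calling compare_precisions() with the default shape gives 6.0 MiB for fp32 and 3.0 MiB for fp16; activations stored during training scale the same way.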

Regarding learn.destroy(): I did not save the model, since I was not going to use it (it did not fit in my GPU’s memory). I called learn.destroy() and then checked gpu_mem_get_free_no_cache(). Nowadays I just restart the whole kernel instead…


(Igor D) #8

Thanks! How do you then deal with the tensor/input mismatch error “RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same”?


(Octav Cristian Florescu) #9

Most probably you also need to convert the input to fp16. I do not have that code with me right now, but this should be it…
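One way to do that conversion in plain PyTorch is sketched below. This is not the code from the post above and not fastai's own mechanism; match_input_dtype and demo are hypothetical names, and the tiny Linear layer stands in for the real model. The idea is simply to cast the batch to whatever dtype and device the model's weights use before calling the model.

```python
def match_input_dtype(model, batch):
    """Cast batch to the dtype and device of the model's first parameter."""
    param = next(model.parameters())
    return batch.to(device=param.device, dtype=param.dtype)


def demo():
    """Show the cast on a stand-in model whose weights were converted to fp16."""
    import torch  # imported here so match_input_dtype has no dependencies

    model = torch.nn.Linear(4, 2).half()      # weights are now float16
    batch = torch.randn(3, 4)                 # inputs default to float32
    fixed = match_input_dtype(model, batch)   # now float16, like the weights
    return batch.dtype, fixed.dtype
```

When training through the learner, .to_fp16() is expected to handle the inputs itself, so this kind of manual cast would mostly matter when calling the model directly on your own tensors.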