Memory management in Lecture 7 notebooks?

I am having difficulties running the lecture 7 notebooks.

With the superres notebook, do_fit('2a') uses up all my VRAM (2080 Ti, 11GB) and then do_fit('2b') crashes.

With the superres_gan notebook, the same thing happens after learn.save('gan-1c'). Is that because starting fit on the same learner again clones the data again? How can we free up the memory without deleting the learner? I tried to_fp16(), but that causes show_results to fail. The previous guidance here: https://forums.fast.ai/t/lesson-3-camvid-half-precision-issue-to-fp16/29805 doesn’t seem to work with the superres notebook.

Thanks

Try a smaller batch size. Or save the model, delete the learner, run gc.collect(), and then recreate the learner and load the saved weights. Don’t run any of the show_batch() cells.
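In fastai v1 that looks roughly like the sketch below — the unet_learner line is only a stand-in for whatever call created your learner in the first place, and data, arch, and feat_loss are the notebook’s own objects:

```python
import gc
import torch
from fastai.vision import *      # assuming the notebook's usual imports

# 1. Checkpoint the current weights before freeing anything.
learn.save('2a')                 # any checkpoint name works

# 2. Drop the learner so Python and CUDA can reclaim the memory.
del learn
gc.collect()
torch.cuda.empty_cache()         # optional: hands cached blocks back to the driver

# 3. Recreate the learner with the same call the notebook used earlier, then reload.
learn = unet_learner(data, arch, loss_func=feat_loss,   # stand-in: copy your original call
                     blur=True, norm_type=NormType.Weight)
learn.load('2a')
```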


I also have a 2080 Ti and I can get through all of the lesson7-superres notebook except for the last section, “Test”. I’ve been careful about what I load, but it still fails on:

p,img_hr,b = learn.predict(img)

Wondering if anyone else is having problems with this and knows a fix. Thanks.

End of error message:

```
~/anaconda3/envs/fastai-course-v3/lib/python3.7/site-packages/fastai/layers.py in forward(self, x)
    136         self.dense=dense
    137
--> 138     def forward(self, x): return torch.cat([x,x.orig], dim=1) if self.dense else (x+x.orig)
    139
    140 def res_block(nf, dense:bool=False, norm_type:Optional[NormType]=NormType.Batch, bottle:bool=False, **kwargs):

RuntimeError: CUDA out of memory. Tried to allocate 773.50 MiB (GPU 0; 10.73 GiB total capacity; 8.95 GiB already allocated; 454.38 MiB free; 10.37 MiB cached)
```

BTW, using 960x1200 rather than 1280x1600 avoids the out-of-memory error. But I would still like to be able to do larger images.
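For reference, this is roughly where I drop the size in the Test-section data block that feeds learn.predict() — my pipeline follows the notebook (path_mr / path_hr are its medium- and high-res folders), so yours may differ:

```python
from fastai.vision import *      # assuming the notebook's usual imports

size = (960, 1200)               # instead of (1280, 1600)
data_mr = (ImageImageList.from_folder(path_mr)
           .split_by_rand_pct(0.1, seed=42)
           .label_from_func(lambda x: path_hr/x.name)
           .transform(get_transforms(), size=size, tfm_y=True)
           .databunch(bs=1).normalize(imagenet_stats, do_y=True))
data_mr.c = 3
```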

Actually, that brings up another question: is there any way to adapt this U-Net-based super-resolution model to generate very large images (well beyond any GPU memory size)? I guess you could set the model to run on the CPU and wait a long time :slight_smile:

One more aside - I used “i-2” rather than “i-1” when selecting the VGG layers for the feature and style losses, since I’ve read in several places that you don’t want to lose the information (the negative values) thrown away by the ReLU. I do see a slight difference in the result, but I haven’t looked at enough examples to say which I prefer. (Although I didn’t do it, I assume I would need to retune the loss weights given that change.)
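For anyone curious, this is roughly the change I mean, based on the notebook’s layer-selection line:

```python
from fastai.vision import *                 # gives children, requires_grad, nn, NormType
from torchvision.models import vgg16_bn

vgg_m = vgg16_bn(True).features.cuda().eval()
requires_grad(vgg_m, False)
# i-1 selects the ReLU just before each MaxPool2d; i-2 selects the layer before the
# ReLU (the BatchNorm in vgg16_bn), so the negative activations aren't clipped away.
blocks = [i-2 for i, o in enumerate(children(vgg_m)) if isinstance(o, nn.MaxPool2d)]
```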

If you have a lot of main memory, maybe you can use the CPU to do inference and handle larger images?
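Something along these lines might work in fastai v1 — untested on the superres notebook, so treat it as a sketch (learn and img are the notebook’s objects):

```python
import torch
from fastai.torch_core import defaults

defaults.device = torch.device('cpu')    # databunches created after this use the CPU
learn.data.device = torch.device('cpu')  # point the existing databunch there too
learn.model = learn.model.cpu()
p, img_hr, b = learn.predict(img)        # slow, but limited by main memory rather than VRAM
```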