Lesson 7 official topic

Discovered some promising info regarding GPU memory leaks following a “CUDA out of memory error.” Still working through it myself, but that could take some time and I didn’t want to lose the links:

[EDIT] These didn’t work out as hoped.

  1. Evaluating “1/0” to force a new exception, which was supposed to release the resources held by the previous frame, did not work.

  2. Setting os.environ['FASTAI_TB_CLEAR_FRAMES']="1" at the top of the notebook didn’t work.

  3. The “Custom Solutions” using @gpu_mem_restore and with gpu_mem_restore_ctx(): didn’t work.
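
For anyone wondering why these tricks were expected to help: as I understand it, a saved exception keeps its traceback, the traceback keeps the frames, and the frames keep their local variables (including any big tensors). Here is a minimal CPU-only sketch of that mechanism. FakeTensor, failing_step and demo are made-up names for illustration, and traceback.clear_frames is the standard library call, not necessarily what fastai does internally.

```python
import gc
import traceback
import weakref


class FakeTensor:
    """Hypothetical stand-in for a large GPU tensor; no CUDA needed."""


def failing_step(probe):
    big = FakeTensor()              # local that would be holding GPU memory
    probe.append(weakref.ref(big))  # weak ref lets us check if it is still alive
    raise RuntimeError("CUDA out of memory (simulated)")


def demo():
    probe = []
    try:
        failing_step(probe)
    except RuntimeError as exc:
        saved = exc  # a notebook keeps the last exception around in a similar way

    gc.collect()
    alive_before = probe[0]() is not None  # frame locals are still referenced

    # Clear the locals of the finished frames in the saved traceback.
    traceback.clear_frames(saved.__traceback__)
    gc.collect()
    alive_after = probe[0]() is not None
    return alive_before, alive_after


print(demo())
```

On CPython this should print (True, False): the object survives while the traceback is held and is freed once the frames are cleared. That is only the theory behind items 1 and 2; it does not explain why they made no difference in my case.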

In all cases, the behaviour is unchanged, and remains as follows…

train('convnext_large_in22k', 224, epochs=1, accum=1, finetune=False)

CUDA Out Of Memory Error

report_gpu() 

Before GC: GPU:0
process 32095 uses 16263.000 MB GPU memory
Post GC: GPU:0
process 32095 uses 4141.000 MB GPU memory
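
So garbage collection frees roughly 12 GB (16263 MB down to 4141 MB), but about 4 GB stays allocated to the process. The report_gpu() helper itself is not shown above. A rough sketch of a helper that would produce this kind of before/after report, assuming PyTorch is installed (the "Before GC" / "Post GC" labels are copied from the output above, everything else is my guess at the structure):

```python
import gc

try:
    import torch
except ImportError:  # lets the sketch load on a machine without PyTorch
    torch = None


def report_gpu():
    """Print GPU process memory before and after collecting garbage.

    Returns True if a CUDA device was inspected, False otherwise.
    """
    if torch is None or not torch.cuda.is_available():
        print("No CUDA device available")
        return False
    # list_gpu_processes() returns a printable string (it relies on pynvml).
    print("Before GC:", torch.cuda.list_gpu_processes())
    gc.collect()              # drop unreachable Python objects first
    torch.cuda.empty_cache()  # then return cached blocks to the driver
    print("Post GC:", torch.cuda.list_gpu_processes())
    return True


report_gpu()
```

Note that empty_cache() can only release memory that no live tensor is using, so anything still referenced (for example through a held traceback) would stay on the GPU.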