CUDA OOM while using hooks | Google Colab

I used a hook WITHOUT removing/deleting it after training. Now when I try to re-run the notebook, I get an OOM error as soon as I start training the first epoch.

I've added `.remove()` to the hooks now, but I still can't get past training. How should I free the CUDA memory? I tried resetting the runtime, which isn't working! :frowning:

```python
from functools import partial

layers_ = flatten_model(learn.model)

class Hook():
    def __init__(self, m, f): self.hook = m.register_forward_hook(partial(f, self))
    def remove(self): self.hook.remove()
    def __del__(self): self.remove()

# collect per-batch activation stats during training
def append_stats(hook, mod, inp, outp):
    if not hasattr(hook, 'stats'): hook.stats = ([], [], [])
    means, stds, outs = hook.stats
    if mod.training:
        means.append(outp.data.mean())
        stds.append(outp.data.std())
        outs.append(outp)  # keeps a reference to the full output tensor
```
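
For context, I attach the hooks and (now) remove them after training, roughly like this:

```python
# Attach one hook per layer, train, then detach them
hooks = [Hook(m, append_stats) for m in layers_]
learn.fit(1)
for h in hooks: h.remove()
```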

Looks like you're trying to store all of the outputs in memory (`outs.append(outp)`), right?
My guess is that won't work on something like Colab (or most other instances): each stored output keeps its CUDA tensor alive, the tensors are large, and you accumulate one per batch for every hooked layer, so memory grows until you hit OOM. Just saving the mean and standard deviation should be fine.
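
Something like this should do it (a sketch based on your code; `.item()` copies each scalar to the CPU as a plain Python float, so no reference to the CUDA tensor is kept):

```python
def append_stats(hook, mod, inp, outp):
    if not hasattr(hook, 'stats'): hook.stats = ([], [])
    means, stds = hook.stats
    if mod.training:
        means.append(outp.data.mean().item())  # .item() -> Python float, no GPU reference kept
        stds.append(outp.data.std().item())
```

And once the hooks are removed and those stored outputs are no longer referenced, `gc.collect()` followed by `torch.cuda.empty_cache()` should release the cached memory without a runtime restart.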


yeah, figured that out.