Optimizing memory consumption


I’m trying to optimize the memory consumption of my network, and I was wondering whether there is an (easy?) way to discard some feature maps during the forward pass (selecting which layers to discard) after use, and recompute them on the fly when needed during backpropagation.

Something like this paper:

They use shared memory for that, and they mention that some of these features are already supported by PyTorch. Has anyone tried that with fastai already? Any insight/examples to show me what I’d have to do to mimic this idea?

It is briefly mentioned in the docs, but it’s a PyTorch feature (gradient checkpointing) and you would need to apply it manually: https://docs.fast.ai/tutorial.resources.html#todohelp-wanted
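For a concrete starting point, here is a minimal sketch of PyTorch’s built-in gradient checkpointing via `torch.utils.checkpoint.checkpoint_sequential` (the toy model and tensor sizes are made up for illustration; assumes a reasonably recent PyTorch). Activations inside each segment are discarded after the forward pass and recomputed during backward, which is exactly the trade-off described above:

```python
import torch
from torch.utils.checkpoint import checkpoint_sequential

# Hypothetical small model -- any nn.Sequential works the same way.
model = torch.nn.Sequential(
    torch.nn.Linear(128, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
)

x = torch.randn(32, 128, requires_grad=True)

# Split the model into 2 segments: only activations at segment
# boundaries are kept; the rest are recomputed during backward.
out = checkpoint_sequential(model, 2, x, use_reentrant=False)
out.sum().backward()
```

For non-sequential models you can wrap individual sub-modules with `torch.utils.checkpoint.checkpoint(module, inputs)` instead, choosing yourself which layers' activations to discard.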

You can also find a few posts about it here in the forum:

Thank you! I was missing the proper keyword for this technique :slight_smile:

I’ll give it a try.

Is there a way to print a model’s memory consumption that takes checkpointing, shared memory, etc. into account?

Maybe fastai’s memory profiling still works with the checkpoint modification: https://docs.fast.ai/callbacks.mem.html

Thx, I’ll give it a try :slight_smile: