Machine Learning to Automate Learning

In this context, gc.collect() is only useful for cleaning up the mess left after deleting problematic objects that don’t clean up after themselves (learn is one of those). But the key is torch.cuda.empty_cache() if you rely on nvidia-smi for visual memory monitoring.
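To make the sequence concrete, here is a minimal sketch of that cleanup: drop the reference, run the garbage collector (needed for objects with reference cycles), then release PyTorch’s cached blocks so nvidia-smi reflects actual usage. The `release_memory` helper and its name are my own invention for illustration, not a fastai or PyTorch API:

```python
import gc

try:
    import torch
    HAVE_TORCH = True
except ImportError:  # allow the sketch to run without PyTorch installed
    HAVE_TORCH = False

def release_memory(obj_name, namespace):
    """Hypothetical helper: drop a reference by name, collect cycles,
    and (if CUDA is available) return cached blocks to the driver."""
    namespace.pop(obj_name, None)       # equivalent of `del learn`
    gc.collect()                        # reclaims objects with reference cycles
    if HAVE_TORCH and torch.cuda.is_available():
        # without this, freed memory stays in PyTorch's cache and
        # nvidia-smi keeps reporting it as "used"
        torch.cuda.empty_cache()
```

Usage would be e.g. `release_memory('learn', globals())` after training.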

I think that’s great. I was also experimenting with batch sizes to take full advantage of available GPU RAM, but ideally, we would have the Learner or a peripheral do that dynamically, which I think you may already be working on in ipyexperiments. :+1:

It should be trivial to automate that - just catch the OOM error in fit(), reduce bs, and try again until it works.
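As a rough sketch of that retry loop (not fastai’s API - `fit_fn` here stands in for whatever actually runs training, and halving is just one possible backoff policy):

```python
def fit_with_backoff(fit_fn, bs, min_bs=1):
    """Retry training with a halved batch size whenever CUDA runs out of memory."""
    while bs >= min_bs:
        try:
            return fit_fn(bs)
        except RuntimeError as e:
            if 'out of memory' not in str(e):
                raise  # some other error; don't swallow it
            # in practice you'd also gc.collect() and torch.cuda.empty_cache()
            # here to release the failed attempt's allocations before retrying
            bs //= 2
    raise RuntimeError('could not fit even at the minimum batch size')
```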

The problem is that bs is not the only hyper-parameter that affects the memory footprint. And the user needs to have control over what they choose to increase/decrease, because the outcome will be impacted a lot by an intelligent choice - a process which cannot yet be fully automated.

Eventually, libraries like fastai will have machine learning built-into their decision making process, so that they could make such intelligent choices, but we aren’t quite there yet. We are building ML components, but we aren’t using them to make better ML components (yet).

In the future, you will have fastai learn your behavior as you tweak hyper-parameters and re-run the training, and try to anticipate your choices for you. And of course, it could gather that intelligence from all fastai users so the community can share it, and new users wouldn’t have to train their fastai install - they could use a pre-trained fastai.
