I notice there’s a lot of waiting around while doing deep learning.
I’m just curious what others do while they’re waiting for their training tasks to run. I often find myself just staring at the accuracy and waiting for the seconds to tick down to zero. What’s the best way to optimize this time?
I go back over the code to add comments for my own understanding, clean something in the house, or, as @machinethink said, do some quick calisthenics.
I view it almost like a race: if there are 4 minutes of training, I move the dryer clothes to the bed, move the wash to the dryer, and get back into my seat. Next time I fold some clothes. The time after that I put them away.
If you have extra hardware you can design/run another experiment in parallel. You can also analyze results of previous experiments and design a few next things to try. Do error analysis. Of course, it’s easier to find things to do if your experiment runs for 30 minutes or a few hours/days. If the experiment takes just 4 minutes to run then there is not much you can do.
To track experiment results I maintain a spreadsheet with experiment name, all hyperparameter values, results (validation loss/accuracy). Also it might be useful to keep some kind of lab journal to keep track of what insights you got after evaluating experiment results and what was the reason for running the experiment (e.g. you got an insight that dropout probability impacts validation loss a lot and you decided to test a few different values for that parameter).
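If a spreadsheet feels too manual, the same idea can be sketched in a few lines of Python using only the standard library. This is a minimal sketch, not anyone's actual tooling: the column names (lr, dropout, note, and so on), the function names and the file layout are all assumptions made for illustration.

```python
import csv
from pathlib import Path

# Hypothetical columns: adapt to whatever hyperparameters you actually vary.
FIELDS = ["name", "lr", "dropout", "val_loss", "val_acc", "note"]


def log_experiment(path, row):
    """Append one experiment to a CSV file, writing the header on first use."""
    path = Path(path)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)


def best_experiment(path, metric="val_loss"):
    """Return the logged row with the lowest value of `metric`."""
    with Path(path).open(newline="") as f:
        rows = list(csv.DictReader(f))
    # csv returns strings, so convert before comparing.
    return min(rows, key=lambda r: float(r[metric]))
```

The "note" column doubles as a small lab journal: write down why you ran the experiment before it starts and what you learned once it finishes. The resulting file opens in any spreadsheet program.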
Jakub from neptune.ml here.
If you keep track of metrics, hyperparameters and stuff you can do meta analysis while waiting and figure out what to do next. For example you could:
compare hyperparameters with stuff like skopt.plots. I have written a helper that lets you convert a simple dataframe with metric and parameter values as columns into a scipy.optimize.OptimizeResult object, which is what skopt.plots expects. You can check it out here.
On a side note, I have just added a simple callback that lets you monitor fastai training in Neptune to our neptune-contrib library. I explain how it works in this blog post but basically, with no change to your workflow, you can track code, hyperparameters, metrics and more.
Before you ask, Neptune is now open and free for non-organizations.
Read more about it on the docs page to get a better view.
I am thinking about that question too. I just wonder whether you have found a good solution. I will try to use https://tomato-timer.com/ or some other kind of pomodoro timer. Then, it is good to have a routine or template:
estimate how long the training will run.
decide what kinds of activities are relevant and let you switch back to the job/coding/learning without much distraction.
Actually, I am waiting for a training run right now. So one of my strategies is to spend the time in the forum. I hope this helps. Learning together.
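The two steps above can be sketched in Python. This is only a rough illustration: the function names are made up, and the 5-minute and 30-minute cut-offs are assumptions loosely based on the numbers mentioned earlier in the thread, not a rule.

```python
def estimate_remaining_seconds(epoch_durations, total_epochs):
    """Estimate remaining time from the mean duration of the finished epochs."""
    if not epoch_durations:
        raise ValueError("need at least one finished epoch to estimate")
    done = len(epoch_durations)
    mean = sum(epoch_durations) / done
    return mean * max(total_epochs - done, 0)


def suggest_activity(remaining_seconds):
    """Map the estimate to a kind of activity (thresholds are assumptions)."""
    if remaining_seconds < 5 * 60:
        return "short: comment code or stretch"
    if remaining_seconds < 30 * 60:
        return "medium: a chore or one pomodoro of reading"
    return "long: error analysis or designing the next experiment"


# Three epochs finished so far (in seconds), ten planned in total.
remaining = estimate_remaining_seconds([60, 62, 58], total_epochs=10)
print(round(remaining / 60), "minutes left ->", suggest_activity(remaining))
```

The estimate assumes every epoch takes about as long as the ones already finished, so it is worth re-checking after a few more epochs.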