How do you create Training Scripts after Experimentation

Following Jeremy’s response to Karpathy’s tweet a few months ago: https://twitter.com/jeremyphoward/status/1528992716407066624?s=20&t=vdL9aMUgTeHDUK0eQCe9qA, I got curious about people’s approaches to creating training scripts once they are done experimenting in a notebook.

During the course, Jeremy teaches in notebooks, which are very convenient for experimenting and trying things out, like different hyperparameters and ways to improve a model.

When I started working at my current company, and also while going through open-source projects on GitHub, I often noticed people having a file like training.py which they conveniently call with:

python training.py --epochs 5
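For context, a minimal sketch of what such a training.py usually looks like, using argparse from the standard library. The train function here is a placeholder, not anyone's actual training code:

```python
# training.py -- a minimal sketch of the script pattern above.
# The training loop is a placeholder; the point is the CLI wiring.
import argparse

def train(epochs: int, lr: float) -> None:
    # Swap in your real model, data loading, and training loop here.
    for epoch in range(epochs):
        print(f"epoch {epoch + 1}/{epochs} (lr={lr})")

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Train a model from the CLI")
    parser.add_argument("--epochs", type=int, default=5,
                        help="number of training epochs")
    parser.add_argument("--lr", type=float, default=1e-3,
                        help="learning rate")
    return parser.parse_args(argv)

if __name__ == "__main__":
    args = parse_args()
    train(args.epochs, args.lr)
```

With this structure the same file works both as `python training.py --epochs 5` and as an importable module whose train function you can still call from a notebook.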

While I do not wish to reopen the hot debate about notebooks in production, I was wondering how other people approach this. If I am done experimenting and have a working combination of a model and hyperparameters, and I want, for example, to train the model for a couple of days and leave it running, how should I approach it? Do you still use your notebook and leave it running all that time, or do you develop a script?
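For the "leave it running for days" part, one common approach (assuming you do end up with a script) is to detach it from your terminal with nohup; tmux or screen work similarly. A rough sketch, with training.py stubbed out here purely so the example is self-contained:

```shell
# Stub training.py purely for illustration; use your real script instead.
printf 'print("epoch 1 done")\n' > training.py

# nohup keeps the process alive after you log out; output lands in train.log
nohup python3 training.py --epochs 100 > train.log 2>&1 &
echo "started training with PID $!"

wait $!        # in real use you would log out here instead of waiting
cat train.log  # later, use `tail -f train.log` to follow progress
```

The saved PID lets you check on or kill the job later, and the log file is how you monitor progress after reconnecting.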

In his response, Jeremy mentions two fastai projects, fastscript and fastgpu, which I am currently looking at to see if they fit my use case. If anyone has an example of how to combine these two projects, that would help.

My current approach is to use nbdev + metaflow, like in this example repo: GitHub - jimmiemunyi/nbdev-metaflow-example: An example nbdev-metaflow workflow integration. But it is very clunky and hacky, as I described in this issue: Custom Directives and Example: Integration with MetaFlow · Issue #1019 · fastai/nbdev · GitHub


Hi,
I have a very non-notebook approach to this. I use Neovim, which essentially allows me to jump around code / into other libraries at a touch. It comes with lots of plugins and, if set up well, exhibits most features of a fully-featured IDE. Some plugins even let you convert your .py files into notebooks on the fly, if that’s your preference.

So I code live in the .py file with an interpreter running in a split window right above. I jump around to execute whatever I want to execute. And at the end of the file I have the
if __name__ == '__main__': …

so that if I want to invoke it from the command line (rather than edit it with a REPL on the go), that’s also an option. If this appeals to you, go over to the Neovim IDE from Scratch - Introduction (100% lua config) - YouTube series for an overview. I highly recommend this, as my productivity is very high, having tried Jupyter and PyCharm (which I still use for visual debugging).
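The pattern described above, a file you can drive piece by piece from a REPL but also invoke from the command line, looks roughly like this (run_experiment is a made-up placeholder name):

```python
# Module-level code can be sent line by line to the split-window
# interpreter; importing this file from elsewhere runs nothing extra.
def run_experiment():
    # Placeholder for the actual experiment code.
    return "experiment results"

# Only executes when invoked as `python thisfile.py`, not on import.
if __name__ == '__main__':
    print(run_experiment())
```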

I would say that, knowing Jeremy’s very strong preference for Jupyter, perhaps I have not given it full attention / tried everything. But having all my code split into separate cells, and having limited access to code in imported libraries, just seems like a deal breaker to me.


Thanks for your input. I will definitely check it out.

You can still import things from a library that are not included in its __all__ variable:

from mylib import function_not_in_all
from mylib import *
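To make the distinction concrete, here is a small self-contained sketch; mylib is a throwaway module written to disk on the fly purely for demonstration:

```python
# Demonstrates that `import *` honors __all__ while explicit imports do not.
import pathlib
import sys

# A throwaway module standing in for a real library.
pathlib.Path("mylib.py").write_text(
    "__all__ = ['public_fn']\n"
    "def public_fn(): return 'public'\n"
    "def function_not_in_all(): return 'hidden'\n"
)
sys.path.insert(0, ".")

from mylib import *    # star import binds only the names in __all__
print(public_fn())                          # public
print('function_not_in_all' in globals())   # False: filtered by __all__

from mylib import function_not_in_all       # explicit import works anyway
print(function_not_in_all())                # hidden
```

So __all__ only controls what a star import pulls in; it never hides anything from an explicit `from mylib import name`.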