How to use nbdev for a non-library ML pipeline?

nbdev seems great for package development, but 90% of the time in production we need an ML pipeline that doesn't explicitly involve developing a new package. For example, I might want to generate train.py (or predict.py) from my notebook for production. In this case, since the notebook lives at a different level than the generated scripts, the paths won't work: if I have a "data" folder containing the data, the relative path to access it will be different in the notebook than in the script. Does anyone have any thoughts on this?


Hi @saneshashank,

My suggestion is to still use nbdev to build your reusable routines as a package.
You can keep your "tests" inside the notebook where your routines live.
You probably wouldn't want to hardcode your paths into your routines anyway; pass the data path in as an argument instead.
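For instance, a minimal sketch of an exported notebook cell (the routine name `load_data` and the CSV filename are just placeholder assumptions) where the path is a parameter rather than baked in:

```python
# export
from pathlib import Path
import pandas as pd

def load_data(data_path="data"):
    "Load the training CSV from `data_path` (passed in, not hardcoded)."
    return pd.read_csv(Path(data_path) / "train.csv")
```

In the notebook you can call `load_data("../data")` (or whatever relative path works from the notebook's location) in your test cells, while the production script passes its own path.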

Then using fastcore’s script utility, you can turn these routines into commands
with arguments and help messages.

See this link for more info.
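As a rough sketch (assuming a hypothetical `train_model` routine exported from your package, here called `mylib`), a command built with fastcore's script utility could look like this:

```python
from fastcore.script import call_parse, Param
# from mylib.core import train_model  # hypothetical routine exported from your notebook

@call_parse
def main(
    data_path: Param("Path to the data folder", str) = "data",
    epochs: Param("Number of training epochs", int) = 5,
):
    "Train the model from the command line."
    print(f"Training with data from {data_path} for {epochs} epochs")
    # train_model(data_path, epochs)  # call your packaged routine here
```

You can then register `main` as a console script (e.g. via the console_scripts setting in settings.ini) so it's callable as a plain command from your pipeline.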

Your ML pipeline can then just be bash scripts in conjunction with config files or
whatever you want.

As suggested, you can do an editable install (pip install -e .) so you don't have to keep reinstalling the
package as you update your notebooks (just run nbdev_build_lib to sync them).

HTH.

Best regards,
Butch
