How to use nbdev for a non-library ML pipeline?

seem · September 21, 2022, 1:19am

Great question, and really good answers above! I’m excited about nbdev’s potential for ML pipelines. Some tips that might also be helpful…

Repo-relative paths in notebooks and modules

nbdev’s config system has built-in support for this:

>>> from nbdev.config import get_config
>>> get_config().config_path
Path('/Users/seem/code/nbdev')  # depends on your repo location

Config.config_path is always an absolute path to your repo root.

To avoid repeating yourself, you can also add keys to your settings.ini, for example:

data_dir = data

…then use get_config().path('data_dir'), which will return an absolute path to {your_repo}/data.
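For intuition, here is roughly what that kind of lookup amounts to, as a stdlib-only sketch. This is not nbdev's actual code: the helper names find_repo_root and repo_path are made up for illustration, and I'm assuming the keys live in the [DEFAULT] section of settings.ini.

```python
from configparser import ConfigParser
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path, cfg_name: str = "settings.ini") -> Path:
    """Walk upward from `start` until a folder containing `cfg_name` is found."""
    start = start.resolve()
    for folder in (start, *start.parents):
        if (folder / cfg_name).is_file():
            return folder
    raise FileNotFoundError(f"no {cfg_name} found at or above {start}")


def repo_path(key: str, start: Optional[Path] = None,
              cfg_name: str = "settings.ini") -> Path:
    """Resolve a settings key (e.g. 'data_dir') to an absolute path under the repo root."""
    root = find_repo_root(start or Path.cwd(), cfg_name)
    parser = ConfigParser()
    parser.read(root / cfg_name)
    # Assumption: the key is defined in the [DEFAULT] section.
    return root / parser["DEFAULT"][key]
```

The point is that the result doesn't depend on the working directory: calling repo_path('data_dir') from a notebook in a subfolder or from an exported module should give the same absolute path.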

Notebooks can export to scripts (using fastcore.script if you like, or by exporting script code as is).

You can then run them like so:

python -m your_package.train
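For the "script code as is" route, the exported module could look something like the sketch below. Everything in it is hypothetical (the train function, the --epochs and --lr flags, the placeholder return value), and it uses plain argparse from the standard library rather than fastcore.script.

```python
# Hypothetical contents of your_package/train.py, as exported from a notebook.
import argparse


def train(epochs: int = 1, lr: float = 1e-3) -> dict:
    """Placeholder for the real training code; just echoes its settings."""
    return {"epochs": epochs, "lr": lr}


def parse_args(argv=None) -> argparse.Namespace:
    """Parse command-line flags; `argv=None` means read from sys.argv."""
    parser = argparse.ArgumentParser(description="Train a model (illustrative).")
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--lr", type=float, default=1e-3)
    return parser.parse_args(argv)


def main(argv=None) -> dict:
    args = parse_args(argv)
    return train(epochs=args.epochs, lr=args.lr)


# This guard is what lets `python -m your_package.train` run the module.
if __name__ == "__main__":
    print(main())
```

With that guard in place, something like python -m your_package.train --epochs 3 would pass the flag through to train, and importing the module from a notebook stays side-effect free.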