Great question, and really good answers above! I’m excited about nbdev’s potential for ML pipelines. Some tips that might also be helpful…
Repo-relative paths in notebooks and modules
nbdev’s config system has built-in support for this:
>>> from nbdev.config import get_config
>>> get_config().config_path
Path('/Users/seem/code/nbdev') # depends on your repo location
Config.config_path is always an absolute path to your repo root.
To avoid repeating yourself, you can also add keys to your settings.ini, for example:
data_dir = data
…then use get_config().path('data_dir'), which will return an absolute path to {your_repo}/data.
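If it helps to see what that lookup amounts to, here is a rough stdlib-only sketch of the idea: find the directory holding settings.ini, read the key, and join the two. This is not nbdev's actual implementation; find_repo_root and repo_path are hypothetical helper names made up for illustration, and the temporary directory just stands in for a repo.

```python
from configparser import ConfigParser
from pathlib import Path
import tempfile


def find_repo_root(start: Path, cfg_name: str = "settings.ini") -> Path:
    """Walk upward from `start` until a directory containing `cfg_name` is found."""
    start = start.resolve()
    for d in [start, *start.parents]:
        if (d / cfg_name).exists():
            return d
    raise FileNotFoundError(f"no {cfg_name} found at or above {start}")


def repo_path(key: str, start: Path) -> Path:
    """Resolve a settings.ini key to an absolute path under the repo root."""
    root = find_repo_root(start)
    cfg = ConfigParser()
    cfg.read(root / "settings.ini")
    # nbdev keeps its settings in the [DEFAULT] section of settings.ini
    return root / cfg["DEFAULT"][key]


# Demo: a throwaway directory standing in for a repo (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp).resolve()
    (repo / "settings.ini").write_text("[DEFAULT]\ndata_dir = data\n")
    nbs = repo / "nbs"
    nbs.mkdir()
    # Looked up from a subfolder, the key still resolves against the repo root.
    data_path = repo_path("data_dir", start=nbs)
    assert data_path == repo / "data"
```

The point is that the result is the same whether you call it from a notebook in a subfolder or from an exported module, which is what makes the settings.ini key handy.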
Exporting scripts & running them
Notebooks can export to scripts (using fastcore.script if you like, or by exporting script code as is).
You can then run them like so:
python -m your_package.train
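For the "script code as is" route, an exported module could look something like the sketch below. It uses plain argparse rather than fastcore.script, and everything in it is a placeholder I made up for illustration: the train function, the --epochs and --lr flags, and their defaults.

```python
import argparse


def train(epochs: int, lr: float) -> dict:
    """Placeholder training step; returns the settings it was called with."""
    return {"epochs": epochs, "lr": lr}


def main(argv=None) -> dict:
    """Parse command-line flags (hypothetical ones) and run `train`."""
    parser = argparse.ArgumentParser(description="Train a model (illustrative).")
    parser.add_argument("--epochs", type=int, default=5)
    parser.add_argument("--lr", type=float, default=1e-3)
    args = parser.parse_args(argv)  # argv=None means "read sys.argv"
    result = train(args.epochs, args.lr)
    print(result)
    return result


# This guard is what lets `python -m your_package.train` run the module,
# while a plain import of the module stays side-effect free.
if __name__ == "__main__":
    main()
```

With that in your_package/train.py, something like python -m your_package.train --epochs 10 would run it, and you can still import train from other notebooks or modules.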