Notebook orchestration

Hi!
First of all, thanks for developing nbdev. It makes our lives as data scientists so much easier.
I have numerous notebooks describing different workflows.

Is there a way in nbdev to orchestrate the execution of a predefined sequence of notebooks? Something like make, but with proper documentation and the ability to define dependencies?

Thanks again!


Just FYI: we have been using Taskfile so far and are quite happy with it: https://taskfile.dev/#/
It allows you to define dependencies and run full jupyter notebook workflows from the command line. An alternative would be e.g. Airflow.
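
For anyone who wants to see the idea without installing a task runner, here is a minimal Python sketch of dependency-ordered notebook execution. It is not an nbdev feature and not how Taskfile works internally. The notebook names are hypothetical, and it assumes Python 3.9+ (for graphlib) and that the `jupyter nbconvert` command is available on the PATH.

```python
import subprocess
from graphlib import TopologicalSorter

# Hypothetical notebook names; each key maps to the notebooks it depends on.
DEPENDENCIES = {
    "01_load.ipynb": [],
    "02_clean.ipynb": ["01_load.ipynb"],
    "03_train.ipynb": ["02_clean.ipynb"],
    "04_report.ipynb": ["02_clean.ipynb", "03_train.ipynb"],
}


def execution_order(dependencies):
    """Return the notebooks in an order that respects their dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


def run_notebook(path):
    """Execute one notebook in place; raises CalledProcessError on failure."""
    subprocess.run(
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", path],
        check=True,
    )


def run_all(dependencies, runner=run_notebook):
    """Run every notebook in dependency order and return that order."""
    order = execution_order(dependencies)
    for notebook in order:
        runner(notebook)
    return order


if __name__ == "__main__":
    run_all(DEPENDENCIES)
```

Because `check=True` stops at the first failing notebook, downstream notebooks are not run on stale inputs. The `runner` argument is only there so the ordering can be tried out without actually executing anything.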


Thanks for starting this topic @fabsta. I’m interested as well.

Has anyone had any luck developing production Airflow DAGs using nbdev? I’d love to hear any suggestions or experiences you’ve had.


I failed at my first attempt at using nbdev for an Airflow deployment. I’ll try again at some point, because I don’t think my issues were fundamental to Airflow or nbdev. Basically I didn’t find any good reason to keep writing Airflow DAGs in .py files. But I did find a lot of hurdles to doing it in Jupyter notebooks. :smiley:

First, a big shoutout to lib2nbdev, without which I wouldn’t have even attempted converting the 60+ DAGs we have into notebooks. If you haven’t seen it, it’s very simple to use, and I hope that it can drive a bit of adoption of literate programming in the corporate software engineering space.

Currently nbdev seems focused on a specific use case: creating delightful Python libraries using Jupyter Notebooks. In that, I think it has succeeded and moved the software engineering world forward. It has shown what a good literate programming environment can be, and it has been used to produce some awesome libraries as a result. And I think that’s exactly the focus it needs to have to gain adoption and inspire future work.

But I just haven’t been able to easily get it deployed into a mature production system. I can see how to get an nbdev-generated library into a traditional web application via its requirements.txt, but not how to develop that application itself in nbdev. I’m sure it’s just a matter of my lack of experience with nbdev and the fact that it’s still early days for this library.

Without going into mundane technical details and Airflow-specific issues, I hit a lot of speed bumps. Suffice it to say, Airflow’s many quirks certainly don’t help. But I think there might be some things to improve about nbdev’s documentation as well. Just wish I knew what :slight_smile: !

My overall impression is that it’s hard to tell from the nbdev documentation what parts of it are fundamental to doing literate programming in notebooks and what parts can be swapped out. It’s gluing a lot of technologies together, so that’s understandable.

But I still have a lot of unresolved questions, mostly around which parts of settings.ini I can avoid maintaining by hand. I’m sure it’s possible, but I couldn’t figure out how to hack nbdev to my needs. I was trying to get it to generate the docs and the code, but I needed to use a custom requirements.txt (with pip’s --constraint option).
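
One possible workaround, sketched here under stated assumptions rather than as anything nbdev provides: keep requirements.txt as the source of truth and derive from it the space-separated package list that, as far as I understand, the `requirements` key in settings.ini expects. The function name is hypothetical, pip option lines (anything starting with `-`, such as a constraints reference) are simply skipped, and environment markers are not handled.

```python
def requirements_for_settings(requirements_text):
    """Turn requirements.txt content into a space-separated package list.

    Skips blank lines, comments, and pip option lines (those starting
    with "-"). Spaces inside a specifier are removed so that each
    requirement stays a single token.
    """
    names = []
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        # Drop a trailing comment (pip requires whitespace before "#").
        line = line.split(" #", 1)[0].strip()
        names.append(line.replace(" ", ""))
    return " ".join(names)


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as handle:
        print(requirements_for_settings(handle.read()))
```

The printed string still has to be pasted into settings.ini by hand (or by a script of your own), and the constraints file continues to apply only when installing with pip directly.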

Anyways, hope this isn’t taken the wrong way. I’m a big fan of the project and am following from the sidelines.
