NBDev Testing workflow: Inline vs external tests?

Taytay · April 25, 2023, 7:49am

I’m pretty new to all of this (Python, Notebooks, data science, etc), but not new to programming. I am excited about nbdev, and have a question about the recommended testing workflow.

In Jeremy’s YouTube video, “I Like Notebooks”, he recommends writing notebooks in such a way that you can frequently restart the kernel and hit “Run All”.

The NBDev documentation makes the point that writing tests in the same notebook as the code is valuable. Generally I agree. However, I frequently find myself writing some pretty heavy tests that rely upon external services. If I hit “Run All” in that notebook, those long-running cells would run too, which is not what I want. I could tag these cells as “#slow”, which would affect whether nbdev_test runs them, and that’s fine. But it wouldn’t stop them from running when I click “Run All” at the top of the notebook.

So these two guidelines make me think that what I really want is the ability to skip certain cells by default when hitting run all in Jupyter, and tag them such that they only run at nbdev_test time.
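One way to approximate this without any cell tagging is to guard the slow cell with an environment check. This is only a minimal sketch: it assumes that nbdev_test sets an `IN_TEST` environment variable in the process that executes the notebook (worth verifying against your nbdev version), and the helper names `running_under_test` and `slow_integration_test` are hypothetical.

```python
import os


def running_under_test(env=None):
    """Return True when the (assumed) IN_TEST variable is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get("IN_TEST", "") not in ("", "0")


def slow_integration_test():
    # Placeholder for a test that would call an external service.
    return "ran slow test"


# In a notebook cell: skipped on a plain "Run All", executed when IN_TEST is set.
result = slow_integration_test() if running_under_test() else "skipped"
print(result)
```

The trade-off is that the guard lives in the code rather than in cell metadata, so it stays visible to anyone reading the notebook.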

But in searching for guidance about how to exclude notebook cells from running by default, many folks advise against this for various reasons, so I think I’m swimming upstream with that idea.

This all leads me to think that I should be moving these longer-running tests into a notebook outside of the notebook where the code is, and then I start to think I’m losing one of the core advantages of nbdev.

I think I’m overcomplicating this. What are you clever folks doing?

michaelaye · April 25, 2023, 8:03am

I don’t think you lose much when you put heavy tests into another notebook. Not only is it automatically recognized by the nbdev_xxx commands, you can also mark a whole notebook as slow and therefore easily exclude it from nbdev_test. So, that’s what I would do. But I’m only semi-experienced with nbdev.

Taytay · April 25, 2023, 8:03am

Oh cool - that’s really helpful. I appreciate the quick response!

seem · April 27, 2023, 11:29am

+1, this is how I would do it too. I personally have a very strong preference to keep notebooks with the main logic very fast. Also, some of the more integration/end-to-end tests can be quite long, which tends to disrupt the narrative in the main notebooks.

Taytay · May 29, 2023, 9:13am

Thank you seem! I appreciate the validation.

JaumeAmoresDS · October 5, 2023, 5:23am

Hi guys,

As a potential alternative, we may integrate magic functions with nbdev to selectively activate / deactivate running certain tests in the notebook. Even the not-so-slow ones can be troublesome when there are many of them in the notebook, and we want to quickly experiment with an update by running the whole thing.
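To make the general idea concrete, here is a rough sketch of a cell magic that only executes its cell when tests are switched on. It is not how nbmodular or nbdev implement anything: the `RUN_NOTEBOOK_TESTS` variable, the `test_cell` magic name, and the helper functions are all hypothetical, and the magic is only registered when the code runs inside IPython.

```python
import os


def tests_enabled(env=None):
    """Return True when the hypothetical RUN_NOTEBOOK_TESTS switch is set to "1"."""
    env = os.environ if env is None else env
    return env.get("RUN_NOTEBOOK_TESTS", "0") == "1"


def run_if_enabled(cell_source, namespace, env=None):
    """Execute cell_source in namespace only when tests are enabled.

    Returns True if the cell ran and False if it was skipped.
    """
    if not tests_enabled(env):
        return False
    exec(cell_source, namespace)
    return True


try:
    from IPython import get_ipython
    ip = get_ipython()  # None when not running inside IPython/Jupyter
except ImportError:
    ip = None

if ip is not None:
    def test_cell(line, cell):
        # Run the cell body in the user's namespace, or skip it silently.
        run_if_enabled(cell, ip.user_ns)

    ip.register_magic_function(test_cell, magic_kind="cell", magic_name="test_cell")
```

With this loaded, a cell starting with `%%test_cell` would be skipped during a normal “Run All” and executed once `RUN_NOTEBOOK_TESTS=1` is set in the kernel’s environment.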

To this end, I created a proof-of-concept that explores this idea. It is called nbmodular (please check it out, it is on PyPI). Currently the Python and test code are exported to separate files without the help of nbdev. I still need to integrate it with nbdev to enable syncing between the created modules and the original notebook. However, this shouldn’t take too much effort, given how easy it is to use nbdev preprocessors…

You will see that nbmodular is actually doing many other things, and that its objectives are somewhat different from what typical nbdev users have in mind… please check the objectives and features described in the README to check what it can do!

Any feedback would be really appreciated