I’m using nbdev to share a course I’m creating, and I have one or two questions I’d like to pick the community’s brain on:
A number of lessons require downloading large-ish datasets or models and working with them throughout the lesson, including some long-running training cells. This is a pain for CI, since testing by running all the cells takes a while. Any suggestions for dealing with this? I don’t want to just skip testing. Maybe detect when we’re in CI and substitute small dummy datasets during testing? I’d also like to minimize the extra code learners have to scroll past, so hundreds of `if testing: n_iter = 5` statements wouldn’t be ideal. Would appreciate any ideas if you’ve dealt with something like this.
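To make the "detect CI" idea concrete, here's the kind of thing I've been sketching: GitHub Actions (and most CI providers) export `CI=true`, so a single check at the top of the notebook could drive every knob below it. The variable names here are just illustrative, not anything from nbdev:

```python
import os

def in_ci(environ=os.environ):
    # GitHub Actions and most CI providers export CI=true.
    return environ.get("CI", "false").lower() == "true"

# Hypothetical knobs: tiny values under CI, real ones for learners.
IN_CI = in_ci()
n_iter = 5 if IN_CI else 10_000
```

The downside is still one conditional per knob, which is exactly the clutter I'd like to avoid showing learners.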
‘Open in Colab’ would be a great option, but just opening the source notebook means lots of nbdev/Quarto directives and so on (along with possibly all the extra code for speeding up testing mentioned above). I’d love to create some sort of pipeline that can pre-process the notebooks to:
- remove directives
- add cells to install requirements
- strip out some other cells focused on testing/CI/nbdev stuff
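To show roughly what I mean, here's a minimal sketch that works directly on the notebook JSON with the stdlib (a `.ipynb` file is just JSON). The `#| hide_colab` marker for CI-only cells is something I made up for illustration, not an nbdev convention:

```python
import json

def prep_for_colab(nb_json, requirements=("nbdev",)):
    """Return a Colab-friendly copy of a parsed notebook dict:
    directives stripped, CI-only cells dropped, install cell prepended."""
    nb = json.loads(json.dumps(nb_json))  # deep copy via round-trip
    cleaned = []
    for cell in nb["cells"]:
        # "source" may be a str or a list of lines; join handles both.
        lines = "".join(cell["source"]).splitlines()
        # Drop whole cells marked as testing/CI-only (made-up marker).
        if any(l.strip().startswith("#| hide_colab") for l in lines):
            continue
        if cell["cell_type"] == "code":
            # Strip nbdev/Quarto directive lines (#| ...) from code cells.
            lines = [l for l in lines if not l.strip().startswith("#|")]
            cell["source"] = "\n".join(lines)
        cleaned.append(cell)
    # Prepend a pip-install cell so Colab has the requirements.
    install = {"cell_type": "code", "metadata": {},
               "execution_count": None, "outputs": [],
               "source": "!pip install -q " + " ".join(requirements)}
    nb["cells"] = [install] + cleaned
    return nb
```

It'd then just need a small build step to run this over each lesson notebook and write the Colab copies somewhere the badge links can point to.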
Any suggestions for how to start on this would be great. @hamelsmu I know we chatted about this briefly when we last met - I wonder if you’ve had any brain waves about the best way to start implementing something like this?