Thank you so much for pushing this forward, @hamelsmu! Impressed with how much you’ve implemented in just a couple weeks.
While I was off, I had some time to think, and here’s what I think would be a good plan to get me back up to speed. Please guide me and nudge me toward the areas that you think make the most sense.
1. In order to provide documentation and ease of use for new users, we can refactor `fastpages` to be an `nbdev` library.

If we check out `nbdev_template` and implement everything that’s currently in `_action_files` as nbdev-driven code, we’ll have:
- documentation for free, with examples of how it works and how to customize it
- easy entrypoints / CLI scripts for free, i.e. `fastpages_build` instead of the current shell scripts in `_action_files` (see the sketch after this list)
- a guarantee that it runs in a clean state, without users having to hack together a local environment
- `fastpages` itself will be automatically tested in the lib notebooks, i.e. unit / integration tests for free
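Just to illustrate what that could look like, here is a hypothetical lib notebook cell; the module layout and the `fastpages_build` name are made up, but the mechanism (`#export` cells plus a `console_scripts` entry in settings.ini) is how nbdev generates CLIs:

```python
#export
# Hypothetical cell in a fastpages lib notebook. nbdev_build_lib lifts
# `#export` cells into the generated module, and adding
#   console_scripts = fastpages_build=fastpages.build:fastpages_build
# to settings.ini exposes the function as a CLI command.
from fastscript import call_parse, Param

@call_parse
def fastpages_build(path: Param("directory with notebook posts", str) = "_notebooks"):
    "Convert the notebooks under `path` into blog posts."
    print(f"Building posts from {path}...")
```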
And the best part, it’s easy to do — maybe an hour or two, and I would be happy to do it — that way I can re-read all the code and get up to speed.
One caveat: `fastpages` is under active development, so unless I can make this change overnight and you can review it right away, we risk merge conflicts. I’m OK with that risk, and willing to keep up with the development pace.
Another caveat: once `fastpages` becomes an nbdev-based lib, it can’t be a GitHub repo template anymore; we’d have to create a separate repo that templates the blog based on `fastpages`, which might be a bit of an overkill (?), and maybe we can just do everything we want as part of the current template-based setup.
After thinking about this, I’d like to discuss it more with you, @hamelsmu. I’m not sure what the best approach is; maybe I should spend more time putting together a proposal before writing any code.
2. Local runner with live reload
I haven’t run the Docker-backed devtools yet, and I think it’s a great idea to use a Docker image so users don’t have to install all the deps if they don’t want to!
On the Python level, we can add a watcher that watches for updates to Word documents and notebooks, and rebuilds the site preview locally on change.
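Here’s a minimal sketch of that watcher, using the `watchdog` package; the watched patterns and the rebuild command (`make build`) are placeholders for whatever fastpages actually uses:

```python
import subprocess
from watchdog.observers import Observer
from watchdog.events import PatternMatchingEventHandler

class RebuildHandler(PatternMatchingEventHandler):
    """Rebuild the site preview whenever a post source changes."""
    def __init__(self):
        # Notebooks, Word docs, and markdown posts are the sources fastpages converts;
        # ignore the generated site so we don't rebuild in a loop.
        super().__init__(patterns=["*.ipynb", "*.docx", "*.md"],
                         ignore_patterns=["*/_site/*"])

    def on_any_event(self, event):
        print(f"{event.src_path} changed, rebuilding preview...")
        subprocess.run(["make", "build"], check=False)  # placeholder rebuild command

if __name__ == "__main__":
    observer = Observer()
    observer.schedule(RebuildHandler(), path=".", recursive=True)
    observer.start()
    try:
        observer.join()  # run until Ctrl-C
    except KeyboardInterrupt:
        observer.stop()
        observer.join()
```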
3. Support running all the notebook-backed posts in a separate CI action on CPU
This will make sure that the notebooks run correctly and the posts are reproducible, with the caveat that the posts have to be able to run on CPU only. If `fastpages` itself is engineered with `nbdev`, and if the site itself is an `nbdev` library, like I did in my blog (just drop another settings.ini in there), then it’s as easy as running `nbdev_test_nbs`.
We can document that behavior and guide users on how to make sure their notebooks run on CPU. Specifically, we can provide functions in `fastpages` that check whether the notebook is running in a GitHub Action, or whether a GPU is available, and based on that do something else in the notebook (i.e. skip the training loop and just grab saved weights from somewhere).
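The helper names and the idea of shipping them in `fastpages` are just a proposal, but the checks themselves are straightforward:

```python
import os
import shutil

def in_github_action() -> bool:
    # GitHub Actions sets GITHUB_ACTIONS=true in every runner.
    return os.environ.get("GITHUB_ACTIONS") == "true"

def has_gpu() -> bool:
    # Cheap framework-agnostic check: nvidia-smi is on PATH
    # whenever an NVIDIA driver is installed.
    return shutil.which("nvidia-smi") is not None
```

A post’s notebook could then guard the expensive cells with `if has_gpu() and not in_github_action(): ...` and fall back to loading saved weights otherwise.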
Note: an alternative approach to conditional execution in `nbdev_test_nbs` would be to add a comment flag that tells it to skip a cell, something like `# don't run this cell in nbdev_test_nbs` (if I remember correctly, nbdev’s `tst_flags` setting already enables this kind of opt-out).
4. Support for GPU-backed CI.
We can also store user credentials for a cloud of choice, say GCP, and make a GitHub Action that spins up a spot GPU instance and runs all the notebooks (or just the new / changed ones) on it! That way, users get full GPU CI for their research articles, guaranteeing reproducibility.
That CI might also report the metrics back, not just `exit(0)`: e.g. plot the loss curve and compare accuracy against a threshold value provided in the post’s front matter metadata.
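A rough sketch of the threshold part, assuming (hypothetically) that the notebook dumps its final metrics to `metrics.json` and the post’s front matter carries a `min_accuracy` key; none of these names exist in fastpages yet:

```python
import json
import sys
import yaml  # PyYAML

def front_matter(path: str) -> dict:
    # Jekyll front matter is the YAML block between the two leading "---" lines.
    with open(path) as f:
        _, block, _ = f.read().split("---", 2)
    return yaml.safe_load(block)

meta = front_matter("_posts/2020-03-01-example.md")  # hypothetical post
accuracy = json.load(open("metrics.json"))["accuracy"]

if accuracy < meta.get("min_accuracy", 0.0):
    # Non-zero exit fails the CI job and surfaces the regression.
    sys.exit(f"accuracy {accuracy:.3f} is below threshold {meta['min_accuracy']}")
```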
I’ll think more about which area is best for me to focus on, and implement a thing or two. I won’t move forward with the nbdev-backed lib idea before we discuss it, and I’ll try to write down more details about it.
To start, I’ll move my blog to the updated `fastpages` and play around with it. If I feel like I want to move things around, I’ll document the proposed changes in PRs.