[fastpages] GitHub Pages Blog Using Nbdev

Hi @chansung18, I’m working on setting up tools for easy local preview, will be doing that next week. Right now you have to call bundle exec jekyll serve (See these docs https://jekyllrb.com/docs/) to test locally.

Happy to take PRs for other things or suggestions, if you know of any other tools that may be helpful. Do you often want facebook embedded posts in your blog?

Thanks

Thanks @hamelsmu. I now got my local environment working.

I don’t share Facebook posts as part of a blog, but I often share them with people in a chat app, so I thought it could be somewhat useful.

@hamelsmu

I am trying to convert notebook to post in my local environment.

As far as I understand, nb2post.py under _action_files directory does this.
However, when I run python nb2post.py from the root directory, I get the message below.

converting: _notebooks/test.ipynb
Use `Config.create` to create a `Config` object the first time

And it doesn’t create a post for me.
How could I test nb2post feature locally?

Config.create is likely an ‘old’ reference - seems there is another function in nbdev:

 from nbdev.imports import *
 create_config??

@hnrk

You are right, but I just use the nbdev library installed with pip,
so it shouldn’t matter which version I have.

All I do in fastpages's nb2post.py is call the notebook2html function defined in nbdev.
The underlying procedure should work fine since I got nbdev from pip.

Hi @chansung18 the best thing to do is to copy settings.ini from here https://raw.githubusercontent.com/fastai/nbdev/master/settings.ini

into the directory from which you are running nbdev. This is one of the little things I plan to automate next week, or at least make easier for users developing locally … but since you are ahead of the game I think that’s all you need to get started!

Please let me know if you run into any more issues

Thank you @hamelsmu.

I got it working fine now!
One issue is that, for now, settings.ini seems to have to be placed in the root directory rather than in _action_files (even though there is a separate settings.ini in the _action_files directory).

@chansung18 Yeah that was just a hack to get you started, the automated system uses the settings.ini file in _action_files/

Next week I will be preparing a shell script with a docker compose, etc. that will put you into a development environment with one command and everything pre-loaded.

@chansung18 I’ve finished updating the docs, please give it a look when you have a chance: https://github.com/fastai/fastpages

Thanks for your hard work @hamelsmu !
I will check the repo :slight_smile:

Thank you @hamelsmu for the awesome job.
I think there is a link missing at the 4th point of the Setup Instructions section (see below).

Setup Instructions

  1. Follow these instructions to create an ssh-deploy key. Make sure you select Allow write access when adding this key to your GitHub account.
  2. Follow these instructions to upload your deploy key as an encrypted secret on GitHub. Make sure you name your key SSH_DEPLOY_KEY.

Hi @farid I just made it bold - there is no link :slight_smile:

However if you are able to go through the instructions and can think of ways to make it better please let me know!

Great work @hamelsmu thanks!
There is indeed a typo in the deploy-key link, I made a PR earlier.

One other note: convert-docker-compose.yml exposes port 8888, and I’m not sure that’s needed. If you run a local Jupyter notebook, it gets in the way. As far as I can see, the port is not needed. Can it be removed or commented out?

Otherwise the local dev workflow seems great. :+1:

I was referring to

  1. Follow these instructions to create an ssh-deploy key

I was expecting a link there like the one at the 5th point, but I noticed that the latter points to the instructions mentioned in the 4th point, if I understand it correctly :slightly_smiling_face:

@farid you are correct. @hnrk just fixed that with a pull request

Hmm, re: port 8888, I left it open in case someone wants to start a Jupyter notebook inside the container, but now that you mention it I don’t think it’s obvious. I’ll make a PR with an explicit command and get your review on it.

@hamelsmu

I have one question.

What is the styling guideline?
The post from Jupyter notebook and Word will be reformatted into Markdown format.
Strictly speaking, this is not a pure markdown format since it has HTML tags in it.

Is there a guideline for adding more HTML, for example how simple it should be?
I believe we could do many things with it, but it would get much more complicated, for example by including embedded lines of JavaScript.

The only guideline I can think of right now is if you want to add HTML you follow this convention:

  1. use the > key: value markdown convention for adding markup that will get turned into {% include <key>.html content=<value> %}, which means you will have to add key.html in _includes/ in the repo.

  2. All of your HTML should be encapsulated within _includes/<key>.html; that way it is modular and can be easily edited or temporarily removed.
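To make the convention above concrete, here is a rough sketch of how a `> key: value` line could be rewritten into an include tag. This is not the code fastpages actually runs: the regular expression, the function names, and the `known_keys` argument are assumptions made for illustration. The value is quoted because Jekyll expects string parameters to includes to be quoted.

```python
import re
from typing import Set

# Hypothetical pattern for a blockquote line of the form "> key: value".
# This is an illustration, not the pattern fastpages uses.
PATTERN = re.compile(r"^>\s*(?P<key>\w+):\s*(?P<value>.+?)\s*$")


def to_include(line: str, known_keys: Set[str]) -> str:
    """Rewrite '> key: value' as a Liquid include tag.

    Lines that do not match, or whose key has no template in
    _includes/, are returned unchanged so ordinary blockquotes survive.
    """
    match = PATTERN.match(line)
    if match is None or match.group("key") not in known_keys:
        return line
    key, value = match.group("key"), match.group("value")
    return '{% include ' + key + '.html content="' + value + '" %}'


def convert(markdown: str, known_keys: Set[str]) -> str:
    """Apply to_include to every line of a markdown document."""
    return "\n".join(to_include(line, known_keys) for line in markdown.splitlines())
```

Restricting the rewrite to `known_keys` (for example, the names of the files found in _includes/) keeps a normal quote such as `> Note: be careful` from being turned into an include for a template that does not exist.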

Is there anything you can think of adding? Some things that would be cool that requires javascript (maybe):

  • Live code hiding/folding
  • Toggle line numbers on code cells
  • Show a copy button when hovering over a code cell

etc.

Thank you so much for pushing this forward, @hamelsmu! Impressed with how much you’ve implemented in just a couple weeks.

While I was off, I had some time to think, and here’s what I think would be a good plan to get me back to speed. Please guide me and nudge me to work on the areas that you think make most sense.

1. In order to provide documentation and ease of use for new users, we can refactor fastpages to be a nbdev project

If we checkout nbdev_template and implement everything that’s currently in _action_files as nbdev-driven code, we’ll have:

  1. documentation for free, with examples of how it works and how to customize it
  2. easy entrypoints / CLI scripts for free, i.e. fastpages_build instead of python nb2post.py
  3. a guarantee that it runs in a clean state, without users having to hack together a settings.ini.
  4. fastpages itself will be automatically tested in the lib notebooks, i.e. unit / integration tests for free.

And the best part, it’s easy to do — maybe an hour or two, and I would be happy to do it — that way I can re-read all the code and get up to speed.

Cons:

  • Since fastpages is under active development, unless I can make this change overnight and you can review it, we risk having merge conflicts. I’m OK with that risk, and willing to keep up with the development pace.
  • If fastpages becomes an nbdev-based lib, it can’t be a GitHub repo template anymore. We’ll have to create a repo that templates the blog based off fastpages, which might be a bit of overkill (?), and maybe we can just do everything we want as a part of nbdev.

After thinking about this, I want to discuss this more with you, @hamelsmu, I’m not sure what’s the best approach, maybe I should spend more time putting together a proposal before writing any code.

2. Local runner with live reload

I haven’t run the Docker-backed devtools yet, and I think it’s a great idea to use a Docker image and not have users install all the deps if they don’t want to!

On a Python-level, we can add a watcher that would watch for updates in Word documents and Notebooks, and rebuild the site preview locally on change.
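A minimal sketch of such a watcher, using only the standard library and polling file modification times. Everything here is an assumption for illustration: the watched suffixes, the function names, and the `rebuild` callback (which would wrap whatever conversion and preview steps the project settles on). A dedicated file-watching library would likely be a better fit for real use.

```python
import time
from pathlib import Path
from typing import Callable, Dict, Iterable, Set

# File types to watch: notebooks and Word documents (assumed suffixes).
WATCHED_SUFFIXES = {".ipynb", ".docx"}


def snapshot(dirs: Iterable[Path]) -> Dict[Path, float]:
    """Map every watched file under `dirs` to its last-modified time."""
    return {
        path: path.stat().st_mtime
        for directory in dirs
        if directory.is_dir()
        for path in directory.rglob("*")
        if path.suffix in WATCHED_SUFFIXES
        and ".ipynb_checkpoints" not in path.parts  # skip Jupyter autosaves
    }


def changed(old: Dict[Path, float], new: Dict[Path, float]) -> Set[Path]:
    """Files that were added, removed, or modified between two snapshots."""
    return {p for p in old.keys() | new.keys() if old.get(p) != new.get(p)}


def watch(dirs: Iterable[Path], rebuild: Callable[[Set[Path]], None],
          interval: float = 1.0) -> None:
    """Poll `dirs` forever and call `rebuild` with the set of changed files."""
    dirs = list(dirs)
    previous = snapshot(dirs)
    while True:
        time.sleep(interval)
        current = snapshot(dirs)
        diff = changed(previous, current)
        if diff:
            rebuild(diff)  # hypothetical hook: convert posts, refresh preview
        previous = current
```

It could be started with something like `watch([Path("_notebooks")], rebuild=print)`, which just prints the changed files. Swapping `print` for a function that reruns the conversion would give a basic live-reload loop.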

3. Support running all the notebook-backed posts in a separate CI action on CPU

This will make sure that the notebooks run correctly and the posts are reproducible, with the caveat that the posts have to be able to run on CPUs only. If fastpages itself is engineered with nbdev, and the site itself is an nbdev library, as I did in my blog (just drop another settings.ini in there), then it’s as easy as nbdev_test_nbs.

We can document that behavior and guide users on how to make sure their notebooks run on CPU. Specifically, we can provide functions in fastpages that check whether the code is running in a GitHub Action, or whether a GPU is available, and based on that do something else in the notebook (i.e. don’t run the training loop, just grab saved weights from somewhere).
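Something along these lines could work as a starting point. The function names are made up for this sketch and do not exist in fastpages. The `GITHUB_ACTIONS` environment variable is set to `true` by GitHub Actions itself, and the GPU check assumes the notebook uses PyTorch.

```python
import os


def in_github_actions() -> bool:
    """GitHub Actions sets GITHUB_ACTIONS=true for every workflow run."""
    return os.environ.get("GITHUB_ACTIONS") == "true"


def gpu_available() -> bool:
    """True if PyTorch is installed and can see a CUDA device."""
    try:
        import torch  # imported lazily so the check works without PyTorch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def should_train() -> bool:
    """Hypothetical policy: only train on a GPU machine outside of CI."""
    return gpu_available() and not in_github_actions()
```

In a notebook cell this would read roughly as `if should_train(): ... else: ...`, with the else branch loading saved weights instead of running the training loop.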

Note: alternative approach to conditional execution in nbdev_test_nbs would be to add a command to nbdev itself:

#skiptest
# don't run this cell in `nbdev_test_nbs`. 

4. Support for GPU-backed CI.

We can also store user credentials to a cloud of choice, say GCP, and make a github action that would spin up a spot GPU instance, and run all the notebooks (or just new / changed notebooks) on that instance! That way, users get full GPU CI on their research articles, guaranteeing reproducibility.

That CI might also report the metrics back, not just exit(0), i.e. plot a loss function and compare accuracy to a threshold value, provided in the page’s front matter metadata.
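As a rough illustration of the threshold idea, the check could read a value from the post’s front matter and compare it with the measured metric. The `min_accuracy` key and both function names are hypothetical, and the parser only handles flat `key: value` pairs; a real implementation would probably use a YAML library.

```python
from typing import Dict


def read_front_matter(text: str) -> Dict[str, str]:
    """Parse flat 'key: value' pairs between the leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing fence ends the front matter
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def check_accuracy(post_text: str, measured: float,
                   key: str = "min_accuracy") -> bool:
    """Return False only if the post declares a threshold and it is missed."""
    meta = read_front_matter(post_text)
    if key not in meta:
        return True  # no threshold declared, so nothing to enforce
    return measured >= float(meta[key])
```

A CI step could then exit with a non-zero status when `check_accuracy` returns False, so that a regression in the notebook’s results shows up as a failed run.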


I’ll think more about what’s the best area for me to focus on, and implement a thing or two. I won’t move forward with nbdev-backed lib thing before we discuss it, and I’ll try to write down more details about it.

The thing I’ll start with is I’ll move my blog to the updated fastpages and play around with it. If I feel like I’ll want to move things around, I’ll document these in PRs.

Hi @xnutsive ! Some things that could use some help at the moment:

  1. Review this pr
  2. I really like your idea #2 for Local runner with live reload

I’m not sure about the idea of running actual code in the notebook inside Actions themselves (#3 and #4), because Actions is a very resource-constrained environment. The way to normally accomplish this is to use something like Kubernetes, where you can define your infrastructure as code and deploy the notebook to get run there. However, I feel like that would be overkill and too complicated for most people (I can barely figure out how to do all these things, to be honest, and it is a pain).

As for making this an nbdev based project (#1) where we develop everything in nbdev, I am not sure about that either as there isn’t a great deal of python code (it is only used to drive the GitHub Actions), and it is difficult to test these Action scripts interactively in a notebook as they must be in Docker containers to emulate the GitHub Actions environment correctly (the idea is if you are going to test locally it should mirror the Actions environment).

That being said, I could be missing something or sometimes I can be dense. Please let me know if I have overlooked something. Also, given your involvement in this project, I would be happy to meet on hangouts or zoom (please DM me either here, twitter, etc) and we can talk real time as I feel like you have lots of good ideas in your head.

Thank you so much for these ideas, really appreciate it.