Thank you so much for pushing this forward, @hamelsmu! Impressed with how much you've implemented in just a couple weeks.
While I was off, I had some time to think, and here's what I think would be a good plan to get back up to speed. Please guide me and nudge me toward the areas that you think make the most sense.
1. In order to provide documentation and ease of use for new users, we can refactor `fastpages` to be an `nbdev` project

If we check out `nbdev_template` and implement everything that's currently in `_action_files` as nbdev-driven code, we'll have:
- documentation for free, with examples of how it works and how to customize it
- easy entrypoints / CLI scripts for free, i.e. `fastpages_build` instead of `python nb2post.py`
- a guarantee that it runs in a clean state, without users having to hack together a `settings.ini`.
- `fastpages` itself will be automatically tested in the lib notebooks, i.e. unit / integration tests for free.
And the best part, it's easy to do: maybe an hour or two, and I would be happy to do it. That way I can re-read all the code and get up to speed.
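To illustrate the "CLI scripts for free" point: nbdev projects declare their console entrypoints in `settings.ini`, so a hypothetical `fastpages_build` command could be wired up with a fragment like this (the module path `fastpages.cli:build` is made up here, just to show the shape):

```ini
; settings.ini (sketch, not actual fastpages config)
[DEFAULT]
lib_name = fastpages
console_scripts = fastpages_build=fastpages.cli:build
```

nbdev then generates the package metadata so that `pip install` exposes `fastpages_build` on the user's PATH.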
Cons:
- Since `fastpages` is under active development, unless I can make this change overnight and you can review it, we risk having merge conflicts. I'm OK with that risk, and willing to keep up with the development pace.
- If `fastpages` becomes an nbdev-based lib, it can't be a GitHub repo template anymore; we'll have to create a repo that templates the blog based off `fastpages`, which might be a bit of an overkill (?), and maybe we can just do everything we want as a part of `nbdev`.
After thinking about this, I want to discuss it more with you, @hamelsmu. I'm not sure what the best approach is; maybe I should spend more time putting together a proposal before writing any code.
2. Local runner with live reload
I haven't run the Docker-backed devtools yet, and I think it's a great idea to use a Docker image and not make users install all the deps if they don't want to!
At the Python level, we can add a watcher that watches for updates to Word documents and notebooks, and rebuilds the site preview locally on change.
3. Support running all the notebook-backed posts in a separate CI action on CPU
This will make sure that the notebooks run correctly and the posts are reproducible, with the caveat that the posts have to be able to run on CPUs only. If `fastpages` itself is engineered with `nbdev`, and if the site itself is an `nbdev` library, like I did in my blog (just drop another `settings.ini` in there), then it's as easy as `nbdev_test_nbs`.
We can document that behavior and guide users on how to make sure their notebooks run on CPU. Specifically, we can provide functions in `fastpages` that check whether it's running in a GitHub Action, or if a GPU is available, and based on that do something else in the notebook (e.g. don't run the training loop, just grab saved weights from somewhere).
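A rough sketch of what those helpers could look like (the names `in_ci` and `has_gpu` are made up; the `GITHUB_ACTIONS` env var is what GitHub Actions actually sets, while the `nvidia-smi` lookup is just one crude way to detect a GPU):

```python
import os
import shutil

def in_ci():
    """True when running inside a GitHub Actions runner.

    GitHub Actions sets GITHUB_ACTIONS=true in the runner environment.
    """
    return os.environ.get("GITHUB_ACTIONS") == "true"

def has_gpu():
    """Crude GPU check: is the NVIDIA driver CLI on the PATH?"""
    return shutil.which("nvidia-smi") is not None
```

A post could then guard expensive cells with something like `if has_gpu() and not in_ci(): train()` and otherwise load saved weights.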
Note: an alternative approach to conditional execution in `nbdev_test_nbs` would be to add a command to `nbdev` itself:

```python
#skiptest
# don't run this cell in `nbdev_test_nbs`
```
4. Support for GPU-backed CI.
We can also store user credentials for a cloud of choice, say GCP, and make a GitHub Action that would spin up a spot GPU instance and run all the notebooks (or just the new / changed notebooks) on that instance! That way, users get full GPU CI on their research articles, guaranteeing reproducibility.
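Roughly, such an action could look like the following (everything here is a placeholder sketch, not a working workflow: the secret name, machine type, zone, and instance lifecycle steps would all need to be worked out):

```yaml
# Hypothetical workflow sketch
name: gpu-notebooks
on: [push]
jobs:
  run-on-gpu:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Authenticate to GCP
        run: |
          echo '${{ secrets.GCP_SA_KEY }}' > key.json
          gcloud auth activate-service-account --key-file=key.json
      - name: Create preemptible GPU instance
        run: |
          gcloud compute instances create fastpages-ci \
            --preemptible \
            --accelerator=type=nvidia-tesla-t4,count=1 \
            --zone=us-central1-a
      # ...then copy the notebooks over, run them, collect the results,
      # and delete the instance even if the run fails.
```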
That CI might also report the metrics back, not just `exit(0)`, i.e. plot a loss function and compare accuracy to a threshold value provided in the page's front matter metadata.
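For instance, a post could declare a minimum accuracy in its front matter, and a CI step could fail the build when the recorded metric falls below it (the `min_accuracy` key and this naive front-matter parser are made up for illustration; a real version would use a YAML parser):

```python
def front_matter(text):
    """Parse simple `key: value` pairs from a ----delimited front matter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta

def check_accuracy(post_text, measured):
    """True if the measured accuracy meets the post's declared threshold."""
    meta = front_matter(post_text)
    threshold = float(meta.get("min_accuracy", 0.0))
    return measured >= threshold
```

The CI job would call something like `check_accuracy` after executing the notebook and exit nonzero when it returns `False`.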
I'll think more about what's the best area for me to focus on, and implement a thing or two. I won't move forward with the `nbdev`-backed lib thing before we discuss it, and I'll try to write down more details about it.
The thing I'll start with: I'll move my blog to the updated `fastpages` and play around with it. If I feel like I want to move things around, I'll document those changes in PRs.