How to include citations in nbdev-exported HTML?

I have been experimenting with nbdev on and off for the past few days. Having spent most of my time working on models in notebooks and then extracting the relevant modules and scripts to give my team common functionality and minimize duplication, I see this tool as a great fit for the daily workflow of our data science team.

I would like to figure out a way to include citations in my exported output. On the Jupyter side I am able to use the cite2c package to view my references.

However, this doesn’t work in the exported Jekyll output.

A preliminary Google search turned up the jekyll-scholar plugin and citeproc-js, which I believe cite2c uses internally, but my knowledge of Jekyll, JavaScript, and Jupyter plugins is very limited.

Please suggest if there is a recommended or alternative workflow for including citations (inline and as references at the end) with nbdev.

It uses kramdown, which supports footnotes, so that’s possibly the simplest approach. Details:

https://kramdown.gettalong.org/quickref.html

Hi @jeremy

Thank you for your response and for the wonderful libraries you have been introducing through fastai. I am learning more about coding and data science just by copy-pasting your code line by line than from anything else available at present.

I have now explored your suggestion a bit more, and it still doesn’t provide exactly what I need. Here are some of my findings:

  • Markdown operations related to kramdown macros (if we can call them that) are confined to a single Jupyter markdown cell. When I put the footnote reference and its definition in the same cell, the output is rendered. However, it doesn’t work when they are in separate cells. I have a few guesses as to why this might be the case.

INPUT

OUTPUT

Additionally a few more things don’t work

Although I am mostly a Python person, I recently finished a significant technical report (~100-120 pages) using the bookdown package from RStudio. During this effort, the following features were a big boost in productivity:

  • Create a separate bib file

  • Include a citation anywhere in the R Markdown file, from figure captions to text, just by writing [@bibkey]

  • Generate a section at the bottom of the rendered HTML with all the cited references

I would love to find a solution that lets me do something similar with nbdev, even if it involves some additional pre-processing steps that are outside nbdev’s scope. Any suggestions on how to go about building something similar or adopting an existing solution would be highly appreciated.

I think you have to use html syntax here to make it work.

You’ll need to run jekyll yourself to do this, and use this package:

Links in quotes in general will work - just not in that specific location, since that’s where the page summary goes.

Thanks @jeremy and @sgugger

Since I am new to the Jekyll world, exploring these options is going to take some time. I have reviewed the following so far:

  • Idea 1: Start from scratch and learn the basics of Jekyll templates and Jekyll plugins. What I have found online:
    • jekyll-scholar is not included on GitHub Pages, so either build _site locally or use a service that can build it for you. Netlify seems to have worked for people. I would prefer to build _site from the Netlify deploy settings, but I have not succeeded so far.

    • Building with Jekyll locally on Windows is giving me a lot of pain. I could use a Linux VM to learn the basics, which I think I should do. But I want to figure out a Windows solution first, as most of my team is on Windows machines with GPUs. [The GPU won’t work inside a VM.]

  • Idea 2: Hack together a command that does some preprocessing before running nbdev_build_docs

    • Looking at the HTML generated by nbdev_build_docs in my docs folder, I find it has been stripped of all metadata, including the metadata added by the cite2c plugin in Jupyter, which means that whatever I do on the Jupyter side will not carry over.

    • I read a bit of the source code to figure out what is happening in the notebook2html function. I am still not certain of everything the HTMLParsers are doing. This would probably require much more effort.

    • I like how I can see references being included dynamically by the cite2c plugin when rendering in Jupyter Notebook. With the above approach there might be a hack along these lines: read metadata -> find all ‘cite’ tags using regex -> make an external call to citeproc-js -> replace the string output -> call nbdev_build_docs. However, this would require much more time to explore, which I currently lack.
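The preprocessing idea in the last bullet could be sketched roughly as follows. This is only a sketch under assumptions, not something nbdev or cite2c provides: it assumes cite2c writes citations into markdown cells as <cite data-cite="KEY"></cite> tags and keeps the entry data under metadata["cite2c"]["citations"] in the notebook JSON. The names link_citations and reference_section are hypothetical. It skips the citeproc-js call entirely and simply numbers the citations and lists their titles.

```python
import re

# Assumption: cite2c marks a citation as <cite data-cite="KEY">...</cite>.
CITE_TAG = re.compile(r'<cite\s+data-cite="([^"]+)"\s*>.*?</cite>', re.DOTALL)


def link_citations(nb):
    """Replace cite tags in markdown cells with numbered same-page links.

    `nb` is the notebook as a plain dict (the parsed .ipynb JSON).
    Returns the cited keys in order of first appearance.
    """
    cited = []

    def _to_link(match):
        key = match.group(1)
        if key not in cited:
            cited.append(key)
        number = cited.index(key) + 1
        return f'<a href="#ref-{key}">[{number}]</a>'

    for cell in nb["cells"]:
        if cell["cell_type"] != "markdown":
            continue
        source = cell["source"]
        # The notebook format allows `source` to be a string or a list of strings.
        if isinstance(source, list):
            source = "".join(source)
        cell["source"] = CITE_TAG.sub(_to_link, source)
    return cited


def reference_section(nb, cited):
    """Build a markdown reference list from the (assumed) cite2c metadata."""
    entries = nb.get("metadata", {}).get("cite2c", {}).get("citations", {})
    lines = ["## References", ""]
    for number, key in enumerate(cited, start=1):
        # Fall back to the bare key when no title is stored for it.
        title = entries.get(key, {}).get("title", key)
        lines.append(f'{number}. <a id="ref-{key}"></a>{title}')
    return "\n".join(lines)
```

The output of reference_section would have to be appended as an extra markdown cell, and the modified notebook written back to disk, before calling nbdev_build_docs.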

I think I have now spent more time on this than I intended. I will keep learning and exploring on and off whenever I can afford some more time.

OK, I got a tentative solution working with nbdev using jekyll-scholar, as @jeremy suggested. Steps to make it work:

  • If you are on Windows, make sure Ruby 2.6.2 is installed and not 2.7

  • Update Gemfile as follows

source "https://rubygems.org"
gem "rake"
gem "jekyll-scholar", group: :jekyll_plugins
gem 'wdm', '>= 0.1.0' if Gem.win_platform?
gem 'github-pages', group: :jekyll_plugins

# Added at 2019-11-25 10:11:40 -0800 by jhoward:
gem "jekyll", "~> 3.7"

  • Run bundle install. This will update the dependencies and Gemfile.lock

  • Add _plugins/jekyl_scholar.rb in the docs folder with the following content

require 'jekyll/scholar'

  • Add a _bibliography/references.bib file. This file contains all the citations.

  • Update _config.yml as follows

scholar:
  source: _bibliography

  • Add the line {% bibliography %} or {% bibliography --cited %} to one of the markdown cells in the notebook

  • I am hosting the site on Netlify, so a git push is enough. [I don’t have to build locally. I am still struggling with viewing the docs locally; the sidebar links are not working at all.]

  • Here is the final result
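The file-creation steps above (plugin file, bibliography folder, _config.yml entry) can be sketched as a small Python script. This is just a convenience sketch, not part of nbdev or jekyll-scholar. The function name scaffold_scholar is hypothetical, and the docs folder location is passed in by the caller. The Gemfile change and bundle install still have to be done by hand. The plugin file name is the one used in the steps; Jekyll loads any .rb file it finds in _plugins.

```python
from pathlib import Path


def scaffold_scholar(docs_dir):
    """Create the jekyll-scholar files described in the steps above.

    Hypothetical helper: writes the one-line plugin file, makes sure the
    bibliography file exists, and appends the scholar section to
    _config.yml if it is not there yet.
    """
    docs = Path(docs_dir)

    # Plugin file that loads jekyll-scholar.
    plugin = docs / "_plugins" / "jekyl_scholar.rb"
    plugin.parent.mkdir(parents=True, exist_ok=True)
    plugin.write_text("require 'jekyll/scholar'\n")

    # Bibliography file; touch() leaves existing entries untouched.
    bib = docs / "_bibliography" / "references.bib"
    bib.parent.mkdir(parents=True, exist_ok=True)
    bib.touch(exist_ok=True)

    # Append the scholar config only once (a crude substring check).
    config = docs / "_config.yml"
    existing = config.read_text() if config.exists() else ""
    if "scholar:" not in existing:
        with config.open("a") as f:
            f.write("\nscholar:\n  source: _bibliography\n")

    return plugin, bib, config
```

Calling scaffold_scholar("docs") from the repository root would set up the docs folder used here. For inline citations, jekyll-scholar also provides a {% cite KEY %} tag, which is what {% bibliography --cited %} picks up.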

Hi,

I would like to propose my solution for citations: https://github.com/ducha-aiki/nbdev/blob/latex_envs_citations/nbs/03_export2html.ipynb

Specifically, it consists of two stages:

  1. Install the Jupyter extension https://github.com/jfbercher/jupyter_latex_envs
    Then one can use LaTeX-style \cite{Ref1} in markdown, together with a BibTeX file.

It also adds 3 buttons to the Jupyter notebook (at the right):

After clicking the “Refresh” and “Book” buttons above, one gets the following rendering in the Jupyter notebook:

And a reference section at the end of the notebook, like this:

  2. Export to HTML with a modified nbdev. Specifically, I have added the following:

The final fastpages-generated result is below, where the links are same-page links to and from the references.

If anyone is interested, I can PR that branch into nbdev. @sgugger

P.S. You can see result here: https://ducha-aiki.github.io/wide-baseline-stereo-blog/2020/03/27/intro.html

@ducha-aiki There is already a way to do this in fastpages! See https://github.com/fastai/fastpages/blob/master/_fastpages_docs/NOTEBOOK_FOOTNOTES.md

Detailed Guide To Footnotes in Notebooks

Notebook -> HTML footnotes don’t work the same as Markdown. There isn’t a good solution, so I made these Jekyll plugins as a workaround.

This adds a linked superscript {% fn 15 %}

{{ "This is the actual footnote" | fndetail: 15 }}

You can have links, but then you have to use single quotes to escape the link.

This adds a linked superscript {% fn 20 %}

{{ 'This is the actual footnote with a [link](www.github.com) as well!'  | fndetail: 20 }}

However, what if you want a single quote in your footnote? There is not an easy way to escape that. Fortunately, you can use the special HTML character &#39; (you must keep the semicolon!). For example, you can include a single quote like this:

This adds a linked superscript {% fn 20 %}

{{ 'This is the actual footnote; with a [link](www.github.com) as well! and a single quote &#39; too!'  | fndetail: 20 }}
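If footnote text is generated from a script rather than typed by hand, the escaping rule above can be applied automatically. Here is a minimal sketch; footnote_detail is a hypothetical helper, not part of fastpages. It assumes the single-quoted form of the fndetail filter shown above and replaces every single quote with the &#39; entity.

```python
def footnote_detail(text, number):
    """Build a single-quoted fndetail tag with single quotes escaped.

    Hypothetical helper: every ' in the footnote text becomes the HTML
    entity &#39; so it cannot end the Liquid string early.
    """
    escaped = text.replace("'", "&#39;")
    return "{{ '" + escaped + "' | fndetail: " + str(number) + " }}"


# Example: prints a tag ready to paste into a markdown cell.
print(footnote_detail("A footnote with a single quote ' in it", 20))
```

Because the text is always wrapped in single quotes, links inside the footnote keep working without further changes.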

I will copy an extended answer from Twitter for the sake of completeness:

A footnote-based solution does not fit my needs because of:

  1. Things mentioned by @rahuketu86

During this effort, the following features were a big boost in productivity:

  • Create a separate bib file
  • Include a citation anywhere in the R Markdown file, from figure captions to text, just by writing [@bibkey]
  • Generate a section at the bottom of the rendered HTML with all the cited references
  2. I generate not only a blog post from the notebook, but also the LaTeX source for my thesis. Doing these two things separately (a nicely formatted blog and nicely formatted LaTeX -> PDF) creates too much burden.

@ducha-aiki thanks for sharing. I hadn’t thought about multiple target outputs. It looks like you may have a plan for how to do that via nbdev, so I look forward to watching that thread.

Well, I have already implemented it; see above.

It works in fastpages locally but fails on GitHub, I suppose because I failed to make GitHub Actions use a custom Docker image instead of the standard one.

Yeah, I can help with that. Why don’t you submit a PR to nbdev? It looks like a reasonable set of features to me. Based on my experience, this is something that has a reasonable probability of being merged. (@sgugger will of course review.)

Once that is done, I can bump all the official images for fastpages to include the changes.

Sure. The reason I haven’t submitted a PR yet is that I am following the contributing guide)))

“Once your approach has been discussed and confirmed on the forum, you are welcome to push a PR, including a complete description of the new feature and an example of how it’s used. Be sure to document your code in the notebook.”

All I would advise is to make sure you include tests (maybe you already have them) in your changes.

I guess another thing you want to check is that there are no collisions with other features. I have some suspicion that this could collide with other magic that handles links, but I don’t think it will. Nothing seems off to me; I am just asking to make sure.

Yes, I have one, although I am not sure whether it is exported to the tests:

#hide
cell = {'cell_type': 'markdown', 'source': r"""This is cited multireference \cite{Frob1, Frob3}.
And single \cite{Frob2}."""}
expected = r"""This is cited multireference [<a class="latex_cit" id="call-Frob1" href="#cit-Frob1">Frob1</a>,<a class="latex_cit" id="call-Frob3" href="#cit-Frob3">Frob3</a>].
And single [<a class="latex_cit" id="call-Frob2" href="#cit-Frob2">Frob2</a>]."""
test_eq(_cite2link(cell)["source"], expected)
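For readers who want to see what kind of transformation this test exercises, here is a minimal sketch that produces the same output for the input above. It is not the code from the branch: cite2link_sketch is a hypothetical stand-in written only from the expected string in the test, and it assumes that the keys inside \cite{...} are separated by commas.

```python
import re

# Matches LaTeX-style citations such as \cite{Frob1, Frob3}.
_re_cite = re.compile(r"\\cite\{([^}]*)\}")


def cite2link_sketch(cell):
    r"""Replace \cite{a, b} in a markdown cell with bracketed same-page links.

    Hypothetical stand-in for the branch's _cite2link, written to match
    the expected string in the test above.
    """
    if cell["cell_type"] != "markdown":
        return cell

    def _links(match):
        keys = [key.strip() for key in match.group(1).split(",")]
        anchors = [
            f'<a class="latex_cit" id="call-{key}" href="#cit-{key}">{key}</a>'
            for key in keys
        ]
        return "[" + ",".join(anchors) + "]"

    # Limitation: citing the same key twice yields duplicate id attributes.
    cell["source"] = _re_cite.sub(_links, cell["source"])
    return cell
```

Each link points to a #cit-KEY anchor, so the reference section at the end of the page needs matching ids for the same-page links to resolve.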

Following up on this, a few months later:
@hamelsmu Is there now any consensus on how we can best (i.e. most easily) incorporate citations from a BibTeX file into our fastpages blogs? I’m fine with trying to plug in jekyll-scholar; I just wasn’t sure if this was still a “you’re on your own” kind of thing or if it has now been ‘standardized’ somehow. (I realize such a feature is probably beyond the scope of your original vision for this increasingly successful blogging platform! :wink: )

Btw, I’m enjoying writing ‘pure Markdown’ posts using Typora (which I find to be a ‘beautiful’ editor), rather than writing in Jupyter. Alternatively, I could switch to doing these “Markdown-only” posts in just one big Jupyter cell and then use @ducha-aiki’s method. But…that would not be my preferred method.

(If it ends up being that I’d need to run some sort of manual script after editing, such that the build is not fully automated, I’m actually ok with that.)