Jupyter Notebook Enhancements, Tips And Tricks


(Stas Bekman) #1

Let’s have a thread dedicated to various enhancements and goodies on Jupyter Notebook usage.

Please contribute your tips and improvements that make our lives easier. Thank you!

I will start:

Go To The Current Running Cell Keyboard Shortcut

I often find myself scrolling through the running notebook, trying to find the currently running cell.

I wrote a keyboard shortcut that takes me there directly using Alt-I.

If you’d like the same functionality, add the following code to ~/.jupyter/custom/custom.js (you may need to create the folder and the file if you don’t have them already):

// Go to Running cell shortcut
Jupyter.keyboard_manager.command_shortcuts.add_shortcut('Alt-I', {
    help : 'Go to Running cell',
    help_index : 'zz',
    handler : function (event) {
        setTimeout(function() {
            // Find the running cells and scroll the first one into view
            if ($('.running').length > 0) {
                //alert("found running cell");
                $('.running')[0].scrollIntoView();
            }}, 250);
        return false;
    }
});

You can experiment with this code also by creating a code cell in your notebook and adding:

%%javascript

on the first line, then pasting the above code on the following lines and running the cell. It will affect only the current notebook.

I got the idea from https://stackoverflow.com/questions/44273643/how-can-i-jump-to-the-cell-currently-being-run-in-a-jupyter-notebook

As for the implementation, I’m not sure if Alt-I is the best choice - suggestions are welcome.

Next, it’d be nice to have a special mode where the notebook automatically re-focuses the view on the currently running cell, so one could easily follow Run All Cells hands off. I don’t yet know enough about Jupyter’s innards to code that one, so suggestions are welcome.

Thank you.


(Stas Bekman) #2

Pretty Print All Cell’s Outputs (and not just the last output of the cell)

Normally only the last expression in a cell gets pretty printed - for the rest you have to manually add print(), which is not very convenient. So here is how to change that:

At the top of the notebook add:

from IPython.core.interactiveshell import InteractiveShell

# pretty print all cell's output and not just the last one
InteractiveShell.ast_node_interactivity = "all"

Examples:

Normal behavior: only one output is printed:

In  [1]: 4+5
         5+6

Out [1]: 11

After the new setting is activated both outputs get printed:

In  [1]: 4+5
         5+6

Out [1]: 9
Out [1]: 11

To restore the original behavior add:

from IPython.core.interactiveshell import InteractiveShell

# pretty print only the last output of the cell
InteractiveShell.ast_node_interactivity = "last_expr"

Note: you have to run the setting change in a separate cell; it takes effect from the next cell run.

To make this behavior consistent across all notebooks edit: ~/.ipython/profile_default/ipython_config.py:

c = get_config()

# Run all nodes interactively
c.InteractiveShell.ast_node_interactivity = "all"

The tip was found here: https://stackoverflow.com/a/36835741/9201239
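
To make the difference between the two settings concrete, here is a toy model in plain Python. It is only a sketch of the idea, not IPython’s actual implementation: the function name run_cell and its return value are made up for illustration. It splits the cell into top-level statements and collects the value of either every expression statement ("all") or only a trailing one ("last_expr"). It assumes Python 3.8 or newer.

```python
import ast


def run_cell(source, interactivity="last_expr"):
    """Toy model of ast_node_interactivity (hypothetical helper, not IPython code).

    Executes `source` statement by statement and returns the list of values
    that would be displayed as Out[...] under the given setting.
    """
    if interactivity not in ("all", "last_expr"):
        raise ValueError("this sketch only models 'all' and 'last_expr'")

    nodes = ast.parse(source).body
    namespace = {}
    shown = []
    for i, node in enumerate(nodes):
        is_last = i == len(nodes) - 1
        is_expr = isinstance(node, ast.Expr)
        if is_expr and (interactivity == "all" or is_last):
            # Expression statement that should be displayed: evaluate it
            code = compile(ast.Expression(node.value), "<cell>", "eval")
            value = eval(code, namespace)
            if value is not None:  # None is never displayed
                shown.append(value)
        else:
            # Everything else is only executed for its side effects
            module = ast.Module(body=[node], type_ignores=[])
            exec(compile(module, "<cell>", "exec"), namespace)
    return shown


cell = "4+5\n5+6"
print(run_cell(cell, "last_expr"))  # [11]
print(run_cell(cell, "all"))        # [9, 11]
```

This mirrors the In/Out examples above: with "last_expr" only 11 is shown, with "all" both 9 and 11 are.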

Suppressing Output

The side-effect of this change is that some libraries will start spewing a lot of noise (e.g. matplotlib). To fix that add ; at the end of the lines whose output you don’t want to be displayed.

Example:

fig, axes = plt.subplots(2, 1, figsize=(20, 20))
axes[0].set_ylim(-5, 5);

Without using ; it outputs:

Out[53]: (-5, 5)

However the hack doesn’t seem to work when the code line that outputs something is part of some loop, in which case the workaround of assigning to _ does the trick:

fig, axes = plt.subplots(2, 1, figsize=(20, 20))
for i in [0, 1]:
    _ = axes[i].set_ylim(-5, 5)

(Stas Bekman) #3

Found a collection of tips here:


(Stas Bekman) #4

To Skip A Cell From Running (e.g. work in progress)

Add at the top of the cell:

%%script false

some multiline code 
that you want to skip for a time being 
(e.g. work in progress) 
without commenting out / deleting cell 
goes here

(Stas Bekman) #5

To Include Markdown in Your Code’s Output (Colors, Bold, etc.)

Just using print() often makes it difficult to have certain outputs stand out in the sea of outputs.

https://stackoverflow.com/a/46934204/9201239 suggests a way to fix that and be able to include markdown in your output - here is a reduced version of the original post:

from IPython.display import Markdown, display
def printmd(string):
    display(Markdown(string))

printmd("**bold text**")

You can add color using html:

def printmd(string, color=None):
    # wrap in a colored span only when a color is given
    if color is not None:
        string = "<span style='color:{}'>{}</span>".format(color, string)
    display(Markdown(string))

printmd("**bold and blue**", color="blue")

I recently started using this for printing scores - it stands out nicely from the rest of the noise.
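
If you want to check the markup before it gets rendered, the string building can be split out of the display call. A minimal sketch follows; colored_md is a hypothetical helper name, and the fallback to print() when IPython is not importable is an addition for illustration only:

```python
def colored_md(text, color=None):
    """Return markdown/HTML markup for `text`, optionally wrapped in a colored span."""
    if color is None:
        return text
    return "<span style='color:{}'>{}</span>".format(color, text)


def printmd(text, color=None):
    """Render the markup in a notebook, or print it raw when IPython is missing."""
    markup = colored_md(text, color)
    try:
        from IPython.display import Markdown, display
    except ImportError:
        print(markup)  # plain-text fallback outside of IPython
    else:
        display(Markdown(markup))


printmd("**bold and green**", color="green")
```

Keeping colored_md separate makes it easy to test the generated markup without a running notebook.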


(Arka Sadhu) #6

This is a pretty amazing set of tips. Thanks a lot for this. I will share mine, but it worked on only one of my remote connections and I haven’t figured out why.

When you are doing a large computation in Jupyter and the connection is lost, there is no way to get the output of the cell back. The issue at https://github.com/ipython/ipython/issues/4140 describes this in more detail. The solutions by jeanpijon and lucasb-eyer there are decent.

Would be interested if someone found any different hack.
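
One possible hack, sketched here as an assumption rather than something taken from the linked issue, is to mirror everything a cell prints into a log file, so the text survives on disk even if the browser loses the connection. The names Tee and log_output and the file name run.log are illustrative only:

```python
import contextlib
import os
import sys
import tempfile


class Tee:
    """File-like object that writes to several streams at once."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


@contextlib.contextmanager
def log_output(path):
    """Mirror everything printed inside the block into the file at `path`."""
    with open(path, "a") as log_file:
        with contextlib.redirect_stdout(Tee(sys.stdout, log_file)):
            yield


# Usage: wrap the long computation (the log location is just an example)
log_path = os.path.join(tempfile.gettempdir(), "run.log")
with log_output(log_path):
    print("long computation finished")
```

This only captures what goes through sys.stdout (e.g. print()); anything written to stderr or displayed as rich output is not logged.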


(Stas Bekman) #7

Restart and Run All fix

My browser is too slow and “Restart and Run All” only manages to restart, but fails to “Run All”. This adds the missing delay and as a bonus a shortcut Meta-F (to complement Alt-F=Run All ignoring errors). Runs from ‘Command Mode’ (i.e. may need to hit Esc first).

// Meta-F: "Restart and Run All" slow delay fix + shortcut [Command mode]
Jupyter.keyboard_manager.command_shortcuts.add_shortcut('Meta-F', {
    help: 'Restart And Run All',
    help_index: 'zz',
    handler: function (event) {
      Jupyter.notebook.kernel.restart();
      var restartTime = 2000; // decrease this if you have a fast computer
      setTimeout(function(){ Jupyter.notebook.execute_all_cells(); }, restartTime);
      return false;
    }
});

See the earlier posts for where to add this code.


(Stas Bekman) #8

Control notebook resource allocation

Sometimes notebooks take way more resources than are available, causing undesired behavior, like a hanging system.

There are various ways to limit system resource usage. I decided to try the modern cgroups to limit memory usage in this case.

sudo cgcreate -a stas:stas -t stas:stas -g memory:JupyterGroup

Replace stas with your setup’s user:group (usually yourusername:yourusername).

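Then I decided to give the group (the notebook server and all its kernels together) a max of 15GB RAM and 25GB total (RAM+swap). With cgroups v1 the RAM+swap limit goes into memory.memsw.limit_in_bytes, which may require swap accounting to be enabled on your system:

sudo echo $(( 15000 * 1024 * 1024 )) > /sys/fs/cgroup/memory/JupyterGroup/memory.limit_in_bytes
sudo echo $(( 25000 * 1024 * 1024 )) > /sys/fs/cgroup/memory/JupyterGroup/memory.memsw.limit_in_bytes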
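
The shell arithmetic simply converts megabytes to bytes. As a quick sanity check of the numbers (a throwaway sketch; the helper name mib_to_bytes is made up):

```python
def mib_to_bytes(mib):
    """Convert mebibytes to bytes, same as $(( mib * 1024 * 1024 )) in the shell."""
    return mib * 1024 * 1024


print(mib_to_bytes(15000))  # 15728640000, the RAM limit
print(mib_to_bytes(25000))  # 26214400000, the 25GB RAM+swap total
```

So the "15GB" limit is about 15.7 decimal gigabytes, since each unit here is 1024 * 1024 bytes.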

Now I start the notebook as:

cgexec -g memory:JupyterGroup jupyter notebook

A greedy kernel now gets terminated if it consumes more than what I allocated, without crashing/overloading my system.

Note that filenames under /sys/fs/cgroup/memory/JupyterGroup/ might be slightly different on your system. You may have to check the documentation.

I also needed to install the following packages, before I was able to run this:

sudo apt-get install cgroup-bin cgroup-lite cgroup-tools cgroupfs-mount libcgroup1


(Stas Bekman) #9

Following the currently executing cell mode

Wouldn’t it be nice to be able to watch the progress of the notebook run hands off? Now you can:

Add the following in ~/.jupyter/custom/custom.js and reload the notebooks you’re running:

/*
 In Command mode Meta-[ turns Follow Exec Cell mode on, Meta-] turns it off.

 To adjust the behavior you can adjust the arguments:
 * behavior: One of "auto", "instant", or "smooth". Defaults to "auto". Defines the transition animation.
 * block:    One of "start", "center", "end", or "nearest". Defaults to "start". Vertical alignment.
 * inline:   One of "start", "center", "end", or "nearest". Defaults to "nearest". Horizontal alignment.
 https://developer.mozilla.org/en-US/docs/Web/API/Element/scrollIntoView
*/
function scrollIntoRunningCell(evt, data) {
    // Nothing is running once the last cell has finished, so check first
    var running = $('.running');
    if (running.length > 0) {
        // block: 'center' keeps the running cell in the vertical middle of the view
        running[0].scrollIntoView({behavior: 'smooth', block: 'center'});
    }
}

Jupyter.keyboard_manager.command_shortcuts.add_shortcut('Meta-[', {
    help: 'Follow Executing Cell On',
    help_index: 'zz',
    handler: function (event) {
        Jupyter.notebook.events.on('finished_execute.CodeCell', scrollIntoRunningCell);
        //console.log("Follow Executing Cell On")
        return false;
    }
});

Jupyter.keyboard_manager.command_shortcuts.add_shortcut('Meta-]', {
    help: 'Follow Executing Cell Off',
    help_index: 'zz',
    handler: function (event) {
        Jupyter.notebook.events.off('finished_execute.CodeCell', scrollIntoRunningCell);
        //console.log("Follow Executing Cell Off")
        return false;
    }
});

Now in Command Mode (when the cell in focus has a blue box around it and not green, or hit Esc to toggle the mode), hit Meta-[ to keep the currently running cell in the middle of the screen, and hit Meta-] to return to normal behavior.

If this is not working, debug the setup by uncommenting the console.log() calls and watching your browser Developer Tools’ Console to check that custom.js got loaded without errors, that the shortcuts got registered and that the handler is activated. Sometimes you need to restart jupyter notebook, but most of the time a tab reload works.

If you just want to jump once to the currently executing cell, use Alt-I after you add the following to ~/.jupyter/custom/custom.js and reload the notebooks you’re running:

// Alt-I: Go to Running cell shortcut [Command mode]
Jupyter.keyboard_manager.command_shortcuts.add_shortcut('Alt-I', {
    help : 'Go to Running cell',
    help_index : 'zz',
    handler : function (event) {
        setTimeout(function() {
            // Find the running cells and scroll the first one into view
            if ($('.running').length > 0) {
                //alert("found running cell");
                $('.running')[0].scrollIntoView();
            }}, 250);
        return false;
    }
});

Caveat: for this to work, all sections must be uncollapsed; otherwise it can’t scroll to a running cell hidden inside a collapsed section.

You can adjust the activation shortcut keys to your liking.

Remember that all 3 shortcuts will only work in the Command mode (see above for figuring that out).

This has been tested to work with jupyter notebook 5.6.0 with python 3.6.6.


(nok) #10

Thanks! works perfectly! Do you know if there is a way to freeze the view of a cell? Kind of like the Excel Freeze Pane function. It is useful for referencing some earlier cell without scrolling.



(nok) #11

Btw, the shortcut doesn’t seem to work in Firefox, but works fine in Chrome. Is there an easy way to activate the custom function without having to press a shortcut every time? Thank you.


(Stas Bekman) #12

Yes, I switched to using Chrome for Jupyter a long time ago, since some things, like certain shortcuts, don’t seem to work in Firefox.


(nok) #13

For some reason, I cannot install Chrome on my laptop; it crashes every time I start the browser.


(Stas Bekman) #14

if it’s a unix system, use gdb’s core dumping or strace to see where it fails and google the outcome?

usually moving the old config folder away solves this kind of problem:

mv ~/.config/google-chrome ~/.config/google-chrome-last

any luck with chromium?


(nok) #15

Thanks for your help, but I am on Windows… no luck so far. I had been searching for quite some time before I switched to Firefox.


(Stas Bekman) #16

nbdime: Selective Diff/Merge Tool for jupyter notebooks

I also want to share some tips about nbdime, which is very useful for working with Jupyter notebooks.

Install it first:

pip install -U nbdime

It should automatically configure itself for Jupyter notebook. If something doesn’t work, see installation.

Then put the following into ~/.jupyter/nbdime_config.json:

{

  "Extension": {
    "source": true,
    "details": false,
    "outputs": false,
    "metadata": false
  },

  "NbDiff": {
    "source": true,
    "details": false,
    "outputs": false,
    "metadata": false
  },

  "NbDiffDriver": {
    "source": true,
    "details": false,
    "outputs": false,
    "metadata": false
  },

  "NbMergeDriver": {
    "source": true,
    "details": false,
    "outputs": false,
    "metadata": false
  },

  "dummy": {}
}

Change outputs value to true if you care to see outputs diffs too.
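
Since the four sections are identical, the file can also be generated instead of typed by hand. Here is a small sketch in Python using only the standard library; the function name build_nbdime_config is hypothetical, and the section and key names are simply copied from the config above:

```python
import json

SECTIONS = ["Extension", "NbDiff", "NbDiffDriver", "NbMergeDriver"]


def build_nbdime_config(outputs=False):
    """Return a config dict with the same flags for every section."""
    flags = {"source": True, "details": False, "outputs": outputs, "metadata": False}
    # Give each section its own copy so they can be tweaked independently later
    return {section: dict(flags) for section in SECTIONS}


config_text = json.dumps(build_nbdime_config(), indent=2)
print(config_text)
# To install it, write config_text to ~/.jupyter/nbdime_config.json
```

Call build_nbdime_config(outputs=True) if you want output diffs in all four sections at once.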

Now when you do:

git diff

Instead of getting a very noisy and hard to parse normal diff:

--- a/examples/tabular.ipynb
+++ b/examples/tabular.ipynb
@@ -2,12 +2,12 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
    "metadata": {},
    "outputs": [],
    "source": [
-    "from fastai import *          # Quick access to most common functionality\n",
-    "from fastai.tabular import *  # Quick access to tabular functionality\n",
+    "from fastai import *          # Very Quick access to most common functionality\n",
+    "from fastai.tabular import *  # Very Quick access to tabular functionality\n",
     "from fastai.docs import *     # Access to example data provided with fastai"
    ]
   },

You will get this sweetness:

[screenshot of the nbdime diff output in the terminal]

The second feature I like is that your notebook now has an [nbdiff] button alongside the other tools, and it’ll show you the diff right in your browser!

[screenshot of the nbdiff button in the notebook toolbar]

And did I say that it does notebook merging too? Whoa! Use the settings above and it’ll ignore any noise when merging (i.e. metadata, execution_count, etc.) and only merge code cells! Of course you can adjust the configuration to suit your needs.
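
To get a feel for what a source-only comparison means, here is a rough standard-library sketch. It is not how nbdime works internally; it just pulls the code cells out of two notebook JSON documents and diffs those, ignoring outputs, metadata and execution_count. The function names are hypothetical:

```python
import difflib
import json


def code_sources(notebook_json):
    """Return the source lines of all code cells in a notebook JSON string."""
    notebook = json.loads(notebook_json)
    lines = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", [])
            if isinstance(source, str):  # source may be a string or a list of lines
                source = source.splitlines(keepends=True)
            lines.extend(line.rstrip("\n") for line in source)
    return lines


def source_diff(old_json, new_json):
    """Unified diff of the code cells only."""
    return list(difflib.unified_diff(
        code_sources(old_json), code_sources(new_json),
        fromfile="a", tofile="b", lineterm=""))


old = json.dumps({"cells": [{"cell_type": "code", "execution_count": None,
                             "metadata": {}, "outputs": [],
                             "source": ["x = 1\n", "print(x)"]}]})
new = json.dumps({"cells": [{"cell_type": "code", "execution_count": 1,
                             "metadata": {}, "outputs": [],
                             "source": ["x = 2\n", "print(x)"]}]})
print("\n".join(source_diff(old, new)))
```

Two notebooks that differ only in execution_count produce an empty diff here, which is the kind of noise the settings above are meant to hide.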

For the full docs see its website.


(nirant) #17

Hey @stas, thanks for this. This is amazing!

I took what I learned here and combined it with what I learnt through fast ai and daily use of Jupyter notebook in one place: https://github.com/NirantK/best-of-jupyter

Hope you find this useful!


(nirant) #18

For those coming here for the first time, here are some direct links to what you can do better in your Jupyter usage:

Contents


(Stas Bekman) #19

Tell the notebook to save itself now

Jupyter notebook by default autosaves itself every 5 min or so if you haven’t changed the defaults.

But if you want to make sure the notebook is saved at the end of the run, you can just insert a new cell at the end of your notebook and make sure you run it:

%%javascript
IPython.notebook.save_notebook()

Now your notebook will always be saved as soon as it’s done running.

This is useful if you need to commit the change to git right afterwards.
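
Before committing, it can be handy to double check that the .ipynb file on disk really was written recently. A small sketch, assuming a hypothetical helper saved_within; the temporary file below only stands in for your notebook file:

```python
import os
import tempfile
import time


def saved_within(path, seconds=60):
    """True if the file at `path` was modified during the last `seconds` seconds."""
    return (time.time() - os.path.getmtime(path)) <= seconds


# Demo on a throwaway file (stands in for your notebook's .ipynb file)
with tempfile.NamedTemporaryFile(suffix=".ipynb", delete=False) as tmp:
    demo_path = tmp.name
print(saved_within(demo_path))  # True, the file was just created
os.remove(demo_path)
```

In practice you would pass the path of the notebook you just ran and only commit when the check returns True.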

credit: the idea came from here.