Jupyter Notebook Enhancements [Discussion]

Thanks, but I don’t want to strip or clear the outputs; I want to clear the input prompt numbers, meaning the 4 in this cell:

image

Hmm, have you tried this command from your command line?

nbstripout --keep-output FILE.ipynb

See if that works.
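For anyone who would rather do this from Python, here is a minimal sketch of the same idea using only the standard library. It assumes the notebook is stored in the nbformat v4 JSON layout (a top-level `cells` list, with `execution_count` on code cells and on `execute_result` outputs); the function names are made up for illustration and are not part of nbstripout.

```python
import json


def clear_prompt_numbers(nb: dict) -> dict:
    """Reset the In[ ]/Out[ ] counters of a parsed notebook, keeping all outputs."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells have no prompt number
        cell["execution_count"] = None
        for output in cell.get("outputs", []):
            # Only some outputs (e.g. execute_result) carry their own counter.
            if "execution_count" in output:
                output["execution_count"] = None
    return nb


def clear_prompt_numbers_in_file(path: str) -> None:
    """Hypothetical helper: rewrite the .ipynb at `path` in place."""
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)
    clear_prompt_numbers(nb)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(nb, f, indent=1, ensure_ascii=False)
        f.write("\n")
```

Calling `clear_prompt_numbers_in_file("FILE.ipynb")` should leave the outputs in place and only blank the numbers; it is worth trying on a copy of the notebook first.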

…actually, I just tried on a test nb and it doesn’t work running the entire nb either - meaning, doing Restart & Run All. I’m on GCP; does that matter?

Is there a way to convert a notebook to a script and back? I would like to convert a notebook to a script because it is easier to refactor in an IDE, but is there then an easy way to convert it back to a notebook?

I am looking for the answer to this question too…

As far as I remember, PyCharm can run Jupyter notebooks as they are, without converting them to scripts. Unfortunately, we cannot use PyCharm to debug a Jupyter notebook, which makes that much less useful. This has been an open feature request for 4 years.
But maybe this hack will work for debugging Jupyter notebooks with PyCharm… I haven’t tried it yet.
Here is the blog post describing the hack:

See this fastai forum thread for a similar request.

Here is a way to debug a notebook with PyCharm, though I am not convinced it is a good solution.

This Visual Python Debugger for Jupyter Notebooks project looks promising, but it did not work well for me. Breakpoints do not always trigger, and it still seems buggy.

I searched a bit about your question.

I found Jupytext, a promising project that syncs Jupyter notebooks with plain text documents or .py files. (It is made by Marc Wouts, the author of the method for debugging Jupyter notebooks with PyCharm that I mentioned in my previous post.)

Jupytext can save Jupyter notebooks as

  • Markdown and R Markdown documents,
  • Julia, Python, R, Bash, Scheme and C++ scripts.

Jupyter will save your notebook to your favorite format(s), from .ipynb to .py , .R , .jl , .md , .Rmd … The text representation can be edited outside of Jupyter. Simply refresh the notebook in Jupyter to get the latest input cells from the script or Markdown document. When refreshing, kernel variables are unaffected, and output cells are reloaded from the traditional .ipynb if present. You can also delete your .ipynb notebook entirely if you don’t need to save output cells.
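To make the "notebook as a script" idea concrete, here is a deliberately simplified sketch of a round trip between a list of cells and a `.py` file where each cell starts with a `# %%` marker. This is not Jupytext's implementation: it ignores cell metadata, anything before the first marker, and code cells that themselves contain a line starting with `# %%`. The function names and the plain-dict cell layout are assumptions made for illustration.

```python
CELL_MARKER = "# %%"


def cells_to_script(cells):
    """Render cells as a script: code stays as-is, markdown becomes comments."""
    chunks = []
    for cell in cells:
        if cell["cell_type"] == "markdown":
            body = "\n".join(
                "# " + line if line else "#" for line in cell["source"].split("\n")
            )
            chunks.append(CELL_MARKER + " [markdown]\n" + body)
        else:
            chunks.append(CELL_MARKER + "\n" + cell["source"])
    return "\n\n".join(chunks) + "\n"


def _finish(kind, lines):
    """Build one cell dict from the lines collected under a marker."""
    while lines and lines[-1] == "":
        lines = lines[:-1]  # drop the blank separator lines between cells
    if kind == "markdown":
        # Undo the comment prefix added by cells_to_script.
        lines = [l[2:] if l.startswith("# ") else l.lstrip("#") for l in lines]
    return {"cell_type": kind, "source": "\n".join(lines)}


def script_to_cells(text):
    """Parse a script produced by cells_to_script back into cells."""
    cells, kind, lines = [], None, []
    for line in text.splitlines():
        if line.startswith(CELL_MARKER):
            if kind is not None:
                cells.append(_finish(kind, lines))
            kind = "markdown" if "[markdown]" in line else "code"
            lines = []
        elif kind is not None:
            lines.append(line)
    if kind is not None:
        cells.append(_finish(kind, lines))
    return cells
```

The point is only that the inputs of a notebook fit in an ordinary, diffable Python file that an IDE can open and refactor; outputs are not stored in the script at all.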

Collaborating on Jupyter Notebooks

With Jupytext, collaborating on Jupyter notebooks with Git becomes as easy as collaborating on text files.

The setup is straightforward:

  • Open your favorite notebook in Jupyter notebook
  • Associate a .py representation (for instance) to that notebook
  • Save the notebook, and put the Python script under Git control. Sharing the .ipynb file is possible, but not required.

Collaborating then works as follows:

  • Your collaborator pulls your script. The script opens as a notebook in Jupyter, with no outputs.
  • They run the notebook and save it. Outputs are regenerated, and a local .ipynb file is created.
  • They change the notebook, and push their updated script. The diff is nothing else than a standard diff on a Python script.
  • You pull the changed script, and refresh your browser. Input cells are updated. The outputs from cells that were changed are removed. Your variables are untouched, so you have the option to run only the modified cells to get the new outputs.
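The last step of that workflow (inputs come from the script, outputs survive only for cells that did not change) can be sketched roughly as follows. This is an illustration under stated assumptions, not Jupytext's actual matching algorithm: cells are plain dicts with a string `source`, unchanged cells are detected by exact source equality, and `merge_inputs_with_outputs` is a hypothetical name.

```python
def merge_inputs_with_outputs(script_cells, saved_cells):
    """Take inputs from the script and reuse saved outputs for unchanged code cells."""
    # Index the saved code cells by their exact source text.
    saved_by_source = {}
    for cell in saved_cells:
        if cell["cell_type"] == "code":
            saved_by_source.setdefault(cell["source"], []).append(cell)

    merged = []
    for cell in script_cells:
        new_cell = {"cell_type": cell["cell_type"], "source": cell["source"]}
        if cell["cell_type"] == "code":
            matches = saved_by_source.get(cell["source"], [])
            if matches:
                old = matches.pop(0)  # unchanged cell: keep its outputs
                new_cell["outputs"] = old.get("outputs", [])
                new_cell["execution_count"] = old.get("execution_count")
            else:
                new_cell["outputs"] = []  # changed or new cell: outputs dropped
                new_cell["execution_count"] = None
        merged.append(new_cell)
    return merged
```

After a merge like this, only the cells with empty outputs need to be re-run, which matches the "run only the modified cells" step described above.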

@jeremy I remember you were working on better version control for Jupyter notebooks, where we can exclude cell outputs. Maybe Jupytext is an ideal solution for that, and on top of that, it lets us sync Jupyter notebooks with our favourite IDE. Effectively, we get the best of both worlds: nice debugging in the IDE and nice inline outputs in Jupyter notebooks.

Debugging notebooks with PyCharm is working perfectly for me. See the method at the end of this thread:
https://forums.fast.ai/t/pycharm-setup-help/12197/13

It does not solve the issue of importing a notebook, refactoring, and exporting, but tracing, breakpoints, and code navigation are all available.

Thanks. I’ll give it a try.

I think both methods have their own advantages. Your method allows debugging without running the code again in PyCharm.

Jupytext, on the other hand, lets you refactor and use PyCharm’s powerful version control and the other tools of a proper IDE. But running the code again in PyCharm is sometimes resource heavy. Saving and loading models can minimize that downside, though.

I will check your method and report back.
I guess you cannot put a breakpoint on a line that does not belong to the fastai library?

You can put a breakpoint in any file in the fastai conda environment. But it needs to be a line in a file. There might be some trick for breaking into Pycharm directly from the notebook, but I suspect you’d end up tracing the ipython interpreter. I’ll try it later.

I think that does make sense.
Maybe we can import a dummy Python library just to break inside it. It shouldn’t be any different from breaking inside the fastai library.
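A minimal sketch of such a dummy module, assuming the debugger is already attached to the kernel as in the method linked above and that the file lives somewhere the notebook can import from. The file name `debug_hook.py` and the function `stop_here` are hypothetical, and this has not been verified against PyCharm.

```python
# debug_hook.py (hypothetical file name, saved next to the notebook)


def stop_here(**values):
    """Pass-through helper whose only job is to give the IDE a line in a file."""
    snapshot = dict(values)  # set the IDE breakpoint on this line
    return snapshot
```

From a notebook cell you would then call something like `from debug_hook import stop_here` followed by `stop_here(x=x, learner=learner)`, and inspect `snapshot` in the IDE when the breakpoint triggers.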

Please note: I gathered all the discussion about this topic and moved it to this newly created thread, which is dedicated to discussion and problem resolution. The original thread is for compiling complete working solutions, with no discussion, so that people can actually find recipes w/o needing to scroll forever.

In case anyone else wants this - I cross posted in Part 1 v3 and did get the answer. See https://forums.fast.ai/t/general-course-chat/24987/333?u=ricknta

How does this part work? How do you open the script in the notebook and get back the whole cell layout?

It is shown in the demo :slight_smile:
Introducing Jupytext PyParis Binder

That could be a solution for my use case, thank you! I will check this out soon.

Hi, I have experimented with Jupytext a bit over the last week.

I think it suggests using the .py as the source, while the notebook serves as a cache of the outputs for a more nicely formatted presentation.

It also comes with the ability to sync the .ipynb and the .py: every time you save the notebook, it updates the script (but not the other way around; you need to run a command for changes in the .py to be reflected in the notebook).

So there are really 2 ways to work:

  1. Work mainly in the .py: you can just open the .py in Jupyter and use it like a notebook, and save an .ipynb when you need to present your output.

  2. Work mainly in the .ipynb, and keep a .py script always in sync with the notebook. The .py gives much nicer code diffs.

Which approach is better in your opinion?

It is up to your use case and taste. For me I don’t need the diff too much, and mostly want to debug notebooks in pycharm. So I prefer this approach. See this for an example.

One of the major issues for me with Jupytext is that you have to run 2 scripts at the same time, which is resource intensive when the scripts are DL stuff. So I switched to only debugging with PyCharm.

I am looking into putting the “Go to current running cell” function into an extension.
I would also like to add a way to jump to the “recent” / last selected cell, so I can jump back and forth between two cells.

I am not sure whether this is doable with an extension or whether it needs to be done within the Jupyter library directly.

If someone is experienced in creating extensions and familiar with Jupyter, could you give me some pointers on where I should focus? I am not too familiar with JS, but I think there are quite a few code snippets floating around; my job is to gather them and put them in the right place.

I like this function so much that I have built a Jupyter extension. Unfortunately, the PR process is rather slow. Here is my PR. The functionality is similar to what @stas built; I added a button, and with a dedicated extension you can easily configure the shortcut with the nbextension configurator.

Great idea, @nok.

Unfortunately, it will not always work due to this bug in the notebook software:

So “go to current running cell” will keep taking you to the first stuck cell instead, forever.

I think more people need to chime in on that bug report with words, and not just +1s, so that someone on the Jupyter dev team will actually do something about it.

And I hope you also integrate the autoscroll with your extension too (except it has the same problem because of that bug).