Jupyter Notebook Enhancements [Discussion]

(Haider Alwasiti) #29

It depends on your use case and taste. I don’t need the diff much and mostly want to debug notebooks in PyCharm, so I prefer this approach. See this for an example.

One of the major issues for me with Jupytext is that you have to run 2 scripts at the same time, which is resource intensive when the scripts are DL stuff. So I switched to debugging only in PyCharm.

0 Likes

(nok) #30

I am looking into putting the “Go to current running cell” function into an extension.
I would also like to create a function that jumps to the “recent” (last selected) cell, so I can jump back and forth between two cells.

I am not sure whether this is doable with an extension or whether it needs to be done within the Jupyter library directly.

If someone is experienced in creating extensions and familiar with Jupyter, could you give me some pointers on where I should focus? I am not too familiar with JS, but I think there are quite a few code snippets floating around; my job is to gather them and put them in the right place.

1 Like

(nok) #31

I like this function so much that I have built a Jupyter extension. Unfortunately the PR process is rather slow. Here is my PR. The functionality is similar to what @stas built; I added a button, and with a dedicated extension you can easily configure the shortcut with the nbextension configurator.

0 Likes

(Stas Bekman) #32

Great idea, @nok.

Unfortunately, it will not always work due to this bug in the notebook software:


So “go to current running cell” will take you to the first stuck cell instead, forever.

I think more people need to chime in on that bug report with words and not just +1s, so that someone at Jupyter dev will actually do something about it.

And I hope you also integrate the autoscroll with your extension too (except it has the same problem because of that bug).

0 Likes

(nok) #33

Interesting, I do experience some lag of the * sometimes but have not observed it getting stuck forever. I will check out your notebook later.

Yes, I have also implemented the autoscroll function with a different default hotkey to activate it (or you can just check the box in nbconfigurator).

1 Like

(Stas Bekman) #34

Interesting, I do experience some lag of the * sometimes but have not observed it getting stuck forever. I will check out your notebook later.

I didn’t say forever, I said for the duration of the notebook run - i.e. as long as there is at least one cell running, those earlier cells report being busy. And since you’re going after the active cell, it will be the wrong cell - it’ll be the first stuck cell.

Do let me know if you can reproduce it on your side - thank you!

The only catch with this approach is that the cells must be uncollapsed; if they are collapsed you can’t jump to the current cell.

And when your PRs are approved and the new extensions become available, please post a little how-to here: Jupyter Notebook Enhancements, Tips And Tricks

1 Like

(nok) #35

I am able to reproduce the bug, although in 1 out of many trials the cell state seemed to behave correctly.

0 Likes

(Stas Bekman) #36

Thank you for doing that. Now you can see how it’d impact your extensions. It started happening specifically with python-3.6.0.

0 Likes

(nok) #37

I have no clue why it is associated with the Python version; I would expect it to come from the Jupyter side. I have done some more testing, but let’s keep the discussion in the issue to hopefully draw some attention from the Jupyter devs.

Some minor progress: I have found a weird way to produce a working environment, but the root cause is unknown.
See https://github.com/jupyter/notebook/issues/4334#issuecomment-464433137

0 Likes

(Stas Bekman) #38

Thank you for helping me to find out what broke it. I’m going to try to reproduce your progress, @nok. And yes, let’s continue it in the github thread so perhaps we get some attention from the devs.

1 Like

(深度碎片) #39

Hi everyone, @stas

I managed to use a snippet in the following way

The details_drop snippet simply inserts ‘[/details][details=""]’ into a cell.

But I am looking for a way to insert this snippet by typing a keyword such as ddrop in a cell, without going through the drop-down list and its sub-list.

Does anyone have a solution?

Thanks

0 Likes

(Stas Bekman) #40

See https://stackoverflow.com/a/51719689/9201239

I haven’t verified that it indeed works. If you test it and it works as described, please add this recipe to Jupyter Notebook Enhancements, Tips And Tricks

Thank you.

P.S. SO is where I find most Jupyter tricks, so in the future search there first and then share here!

1 Like

(深度碎片) #41

Thanks a lot! @stas

Here is how I got this to work with the helpful guide you provided above.

1 Like

(Stas Bekman) #42

Looks great, Daniel!

I’d just clarify the title, perhaps with the more explicit:

“How to add a keyboard shortcut to insert a code snippet”

and I’d remove the note that it’s my guide, since it’s not; I just helped you find it. Instead, it’s best practice to link to the SO answer that you’re copying, to give the authors credit. Thanks.

1 Like

(深度碎片) #43

thanks, it is done!

1 Like

(Malcolm McLean) #44

Since starting the fastai classes, I must have typed shape(), len(), and type() a thousand times. What a great feature it would be if the Jupyter notebook automatically told you the type, dimensions, and internal types every time it displays (or assigns) a value. Such a feature would have saved me a lot of time and bugs!

So I made a function that extracts the essential values that we often need to know while writing machine learning code. Here are some examples:

import numpy as np
import pandas as pd
import torch
from showType import ShowType
st = ShowType()

a = [[i,i+6]for i in range(6)]
st.type_str(a)
'list[6]<list[2]<int, int>, list[2]<int, int>, list[2]<int, int>, list[2]<int, int>,...>'

npa = np.array(a, dtype=np.float32)
st.type_str(npa)
'ndarray[6, 2]<float32>'

st.type_str(torch.tensor(npa))
'Tensor(cpu)[6, 2]<float32>'

#from Lesson 1 - Pets...
st.type_str(data.one_batch())
'tuple[2]<Tensor(cpu)[64, 3, 224, 224]<float32>, Tensor(cpu)[64]<int64>>'

data = [['1/1/2019', 181, 185.6, 187.3, 180], ['1/2/2019', 185.2, 186.6, 188.3, 182]]
df = pd.DataFrame(data, columns = ['Date', 'Open', 'Close', 'High', 'Low']) 
st.type_str(df)
'DataFrame[2, 5]<Date<object> Open<float64> Close<float64> High<float64> Low<int64> >'

st.type_str(df['Open'])
'Series[2]<float64>'

x = (1, 'wsx', 2.3, ['qaz',7])
st.type_str(x)
'tuple[4]<int, str, float, list[2]<str, int>>'

Ideally I would like this type string to be displayed in addition to the output of every notebook cell. I took a look at the docs for Jupyter extensions. It looks very possible to package into a Jupyter extension - the value of the current cell is accessible, and HTML can format as desired. But I realized that I don’t have the skills or time to figure out how to do it.

If anyone wants to tackle integrating type_str into Jupyter, the code is below for the taking. I, and likely others, would find code development much easier with it. Thanks!

import numpy as np
import pandas as pd
import torch

class ShowType():
    # Initial version 20190504 Malcolm McLean

    def __init__(self, width=4):
        self._width = width  #Traverse lists only this far

    def type_str(self, o):  # generate a string that tells us the type and dimensions of o.
        def getLastWord(s):
            ix = s.rfind('.')
            return s if (ix == -1) else s[1 + ix:]

        ts = getLastWord(type(o).__name__)  #The base type name
        rs = ts              # starts the result string

        if hasattr(o, 'shape'):
            if ts == 'Tensor':
                rs += '('+str(o.device)+')'  #Append the device

            rs += str([d for d in o.shape]) #Append the shape dimensions in square brackets

            if ts=='DataFrame':  #List column names and their types
                rs += '<'
                for col, cte in zip(o.columns, o.dtypes):
                    rs += str(col) + '<' + str(cte) + '> '  # str() so non-string column labels also work
                rs += '>'
            elif ts=='Series':
                rs = rs+'<'+str(o.dtype)+'>' #Its type
            else:
                rs += '<'+getLastWord(str(o.dtype)) + '>' #If o has a shape, assume elements are homogeneous, append element type.

        elif hasattr(o, '__len__'):
            if (rs == 'str'): return rs  # String has a length but will not be disassembled further

            #o is likely to be a Python tuple or list.
            rs += '['+str(len(o)) + ']' #append the length
            #Show width members of the contents recursively.
            if len(o)>0:
                rs += '<'
                for i,m in enumerate(o):
                    if (i>=self._width): break
                    if i!=0: rs += ', '
                    rs += self.type_str(m)

                if i>=self._width:   #Were there more elements?
                    rs += ',...'
                rs+= '>'            #Close the list of elements

        return rs


# Python Tuple
# -length
# -anything
#
# Python list
#  -length
#  -anything
#
# Python function
#
# Numpy
#  - shape
#  - homogeneous
#  - len yields 1st dimension
#  - a.dtype
#
# PyTorch
#  -shape
#  -homogeneous, t.type()
#  -len
#  -device
#
# Pandas DataFrame
#  -shape
#  -columns,dtypes
#
#  Pandas Series
#  -shape
#  -dtype
#

Edit: Bug fixed for empty list/tuple.
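
For anyone who wants to experiment before a real extension exists, one possible route is IPython's `post_run_cell` event, whose callback receives a result object with the value of the cell's last expression in its `result` attribute. The sketch below is only an assumption about how the wiring could look, not a finished extension: `make_type_hook`, `register`, and `simple_type_str` are hypothetical names, and `simple_type_str` is a stand-in so the snippet runs without numpy, pandas, or torch.

```python
from types import SimpleNamespace


def simple_type_str(o):
    """Stand-in for ShowType().type_str: type name plus length when available."""
    name = type(o).__name__
    if isinstance(o, str) or not hasattr(o, "__len__"):
        return name
    return f"{name}[{len(o)}]"


def make_type_hook(type_str, printer=print):
    """Build a callback that prints the type string of a cell's result.

    The callback expects an object with a `result` attribute, which is what
    IPython passes to `post_run_cell` callbacks.
    """
    def hook(execution_result):
        value = getattr(execution_result, "result", None)
        if value is not None:  # cells ending in a statement have no value
            printer("type: " + type_str(value))
    return hook


def register(type_str):
    """Attach the hook to the running IPython shell; returns None outside IPython."""
    try:
        from IPython import get_ipython
    except ImportError:
        return None
    shell = get_ipython()
    if shell is None:
        return None
    hook = make_type_hook(type_str)
    shell.events.register("post_run_cell", hook)
    return hook


# Simulate two finished cells: one ending in a list, one ending in a statement.
captured = []
demo_hook = make_type_hook(simple_type_str, printer=captured.append)
demo_hook(SimpleNamespace(result=[1, 2, 3]))
demo_hook(SimpleNamespace(result=None))
```

In a notebook, calling `register(ShowType().type_str)` once should print the type string after each cell that returns a value. This prints plain text below the cell output; formatting it as HTML would still need the extension work described above.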

0 Likes

(Malcolm McLean) #45

Hi all. In reference to the above post, I have been using type_str for a week now, and it has proven to be quite the time and bug saver. As in…

from showType import ShowType
t = ShowType().type_str
t(data.one_batch())

If no one steps up to package it into a Jupyter extension, I am going to hire an outside expert to do so. Can anyone here refer me to such a person?

Thanks!

0 Likes

(RF) #46

Hello, I work through the courses on the train to work each day, so my time for each session is a bit limited. I’ve reached the point where most of my time is spent waiting for the Jupyter Notebook to re-run everything I had been working on before, so that I can get back to the point where I left off.

Is there a way to save my Jupyter Notebook and open it later (after shutting down my Salamander server) without needing to re-run all the prior code I had been using? This would save me 1-2 hours each day to actually work on the course, so thank you in advance for any tips!

1 Like

#47

Yes, I like that suggestion - kind of like what R lets you do. Save the state of variables and load them the next time you are back on the task.
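
A minimal sketch of that idea using only the standard library, assuming the variables you care about can be pickled. `save_state` and `load_state` are hypothetical helper names and the file name is arbitrary. This does not capture GPU state or open file handles, and for trained models the framework's own save functions (for example `torch.save`) are probably a better fit.

```python
import os
import pickle
import tempfile


def save_state(namespace, names, path):
    """Pickle the selected variables from `namespace` (e.g. globals()) to `path`.

    Returns the names that were skipped because they were missing or not picklable.
    """
    state, skipped = {}, []
    for name in names:
        try:
            pickle.dumps(namespace[name])  # check each value on its own
            state[name] = namespace[name]
        except Exception:
            skipped.append(name)
    with open(path, "wb") as f:
        pickle.dump(state, f)
    return skipped


def load_state(namespace, path):
    """Load variables saved by save_state back into `namespace`."""
    with open(path, "rb") as f:
        state = pickle.load(f)
    namespace.update(state)
    return sorted(state)


# Round trip through a temporary file, using plain dicts as stand-in namespaces.
session = {"epochs": 3, "losses": [0.9, 0.5, 0.4], "fn": lambda x: x}
path = os.path.join(tempfile.mkdtemp(), "state.pkl")
skipped = save_state(session, ["epochs", "losses", "fn", "missing"], path)

restored = {}
loaded = load_state(restored, path)
```

In a notebook you would call something like `save_state(globals(), ["df", "results"], "state.pkl")` before shutting the server down and `load_state(globals(), "state.pkl")` in the next session. Only load pickle files you created yourself, since unpickling can execute arbitrary code.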

I have created a short survey of Jupyter usage. It would be helpful if everyone reading this filled it out.

I will share the results with the community here. I also plan to compile all the suggestions brought up in this thread.

0 Likes

#48

Here is the compilation of posts in this thread. I will add more to it if I come across more problems posted by users.

0 Likes