Jupyter Notebook Enhancements [Discussion]

Daniel · March 23, 2019, 2:35am

Thanks a lot! @stas

Here is how I got this to work, using the helpful guide you provided above.

stas · March 23, 2019, 3:03am

Looks great, Daniel!

I’d just make the title more explicit, perhaps:

“How to add a keyboard shortcut to insert a code snippet”

I’d also remove the note that it’s my guide, since it isn’t; I just helped you find it. Instead, the best practice is to link to the SO answer you’re copying from, so that its authors get the credit. Thanks.

Daniel · March 23, 2019, 3:14am

thanks, it is done!

Pomo · May 6, 2019, 12:33am

Since starting the fastai classes, I must have typed shape(), len(), and type() a thousand times. What a great feature it would be if the Jupyter notebook automatically told you the type, dimensions, and internal types every time it displays (or assigns) a value. Such a feature would have saved me a lot of time and bugs!

So I made a function that extracts the essential values that we often need to know while writing machine learning code. Here are some examples:

import numpy as np
import pandas as pd
import torch
from showType import ShowType
st = ShowType()

a = [[i,i+6]for i in range(6)]
st.type_str(a)
'list[6]<list[2]<int, int>, list[2]<int, int>, list[2]<int, int>, list[2]<int, int>,...>'

npa = np.array(a, dtype=np.float32)
st.type_str(npa)
'ndarray[6, 2]<float32>'

st.type_str(torch.tensor(npa))
'Tensor(cpu)[6, 2]<float32>'

#from Lesson 1 - Pets...
st.type_str(data.one_batch())
'tuple[2]<Tensor(cpu)[64, 3, 224, 224]<float32>, Tensor(cpu)[64]<int64>>'

data = [['1/1/2019', 181, 185.6, 187.3, 180], ['1/2/2019', 185.2, 186.6, 188.3, 182]]
df = pd.DataFrame(data, columns = ['Date', 'Open', 'Close', 'High', 'Low']) 
st.type_str(df)
'DataFrame[2, 5]<Date<object> Open<float64> Close<float64> High<float64> Low<int64> >'

st.type_str(df['Open'])
'Series[2]<float64>'

x = (1, 'wsx', 2.3, ['qaz',7])
st.type_str(x)
'tuple[4]<int, str, float, list[2]<str, int>>'

Ideally I would like this type string to be displayed in addition to the output of every notebook cell. I took a look at the docs for Jupyter extensions. It looks quite feasible to package this as a Jupyter extension: the value of the current cell is accessible, and HTML can format it as desired. But I realized that I don’t have the skills or time to figure out how to do it.
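
For anyone picking this up, one lightweight starting point short of a full extension might be IPython's post_run_cell event. The sketch below is only an illustration under assumptions: make_type_hook and register_type_hook are hypothetical names, and it assumes an IPython version whose post_run_cell callbacks receive the execution result object (with the cell's value in its result attribute).

```python
def make_type_hook(describe, emit=print):
    """Build a post_run_cell callback that reports the type of a cell's value.

    describe: any callable mapping an object to a string,
              for example ShowType().type_str from this post.
    emit:     where the description is sent (print by default).
    """
    def hook(result):
        # `result` is assumed to be IPython's execution result object;
        # its `.result` attribute holds the value the cell evaluated to.
        value = getattr(result, "result", None)
        if value is not None:  # cells ending in a statement produce no value
            emit("type: " + describe(value))
    return hook


def register_type_hook(ip, describe, emit=print):
    """Attach the hook to an IPython shell object (what get_ipython() returns)."""
    hook = make_type_hook(describe, emit)
    ip.events.register("post_run_cell", hook)
    return hook  # keep a reference so it can be unregistered later
```

In a notebook cell, register_type_hook(get_ipython(), ShowType().type_str) should then print a line such as "type: ndarray[6, 2]<float32>" after each cell that returns a value. Passing the returned hook to get_ipython().events.unregister("post_run_cell", hook) turns it off again.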

If anyone wants to tackle integrating type_str into Jupyter, the code is below for the taking. I, and likely others, would find our code development made easier. Thanks!

import numpy as np
import pandas as pd
import torch

class ShowType():
    # Initial version 20190504 Malcolm McLean

    def __init__(self, width=4):
        self._width = width  #Traverse lists only this far

    def type_str(self, o):  # generate a string that tells us the type and dimensions of o.
        def getLastWord(s):
            ix = s.rfind('.')
            return s if (ix == -1) else s[1 + ix:]

        ts = getLastWord(type(o).__name__)  #The base type name
        rs = ts              # starts the result string

        if hasattr(o, 'shape'):
            if ts == 'Tensor':
                rs += '('+str(o.device)+')'  #Append the device

            rs += str([d for d in o.shape]) #Append the shape dimensions in square brackets

            if ts=='DataFrame':  #List column names and their types
                rs += '<'
                for col, cte in zip(o.columns, o.dtypes):
                    rs += str(col) + '<' + str(cte) + '> '  #str() so non-string column names also work
                rs += '>'
            elif ts=='Series':
                rs = rs+'<'+str(o.dtype)+'>' #Its type
            else:
                rs += '<'+getLastWord(str(o.dtype)) + '>' #If o has a shape, assume elements are homogeneous, append element type.

        elif hasattr(o, '__len__'):
            if (rs == 'str'): return rs  # String has a length but will not be disassembled further

            #o is likely to be a Python tuple or list.
            rs += '['+str(len(o)) + ']' #append the length
            #Show width members of the contents recursively.
            if len(o)>0:
                rs += '<'
                for i,m in enumerate(o):
                    if (i>=self._width): break
                    if i!=0: rs += ', '
                    rs += self.type_str(m)

                if i>=self._width:   #Were there more elements?
                    rs += ',...'
                rs+= '>'            #Close the list of elements

        return rs


# Python Tuple
# -length
# -anything
#
# Python list
#  -length
#  -anything
#
# Python function
#
# Numpy
#  - shape
#  - homogeneous
#  - len yields 1st dimension
#  - a.dtype
#
# PyTorch
#  -shape
#  -homogeneous, t.type()
#  -len
#  -device
#
# Pandas DataFrame
#  -shape
#  -columns,dtypes
#
#  Pandas Series
#  -shape
#  -dtype
#

Edit: Bug fixed for empty list/tuple.

Pomo · May 10, 2019, 10:13pm

Hi all. In reference to the above post, I have been using type_str for a week now, and it has proven to be quite the time and bug saver. As in…

from showType import ShowType
t = ShowType().type_str
t(data.one_batch())

If no one steps up to package it into a Jupyter extension, I am going to hire an outside expert to do so. Can anyone here refer me to such a person?

Thanks!

rnfr · September 11, 2019, 10:49pm

Hello, I work through the courses on the train to work each day, so my time for each session is a bit limited. I’ve reached the point where most of that time is spent waiting for the Jupyter Notebook to re-run everything I had done previously, just to get back to where I left off.

Is there a way to save my Jupyter Notebook and open it later (after shutting down my Salamander server) without needing to re-run all the prior code I had been using? This would save me 1-2 hours each day to actually work on the course, so thank you in advance for any tips!

crossvalidator · September 21, 2019, 12:33pm

Yes, I like that suggestion - kind of like what R lets you do. Save the state of variables and load them the next time you are back on the task.
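
As a rough illustration of that idea, here is a minimal sketch using only Python's standard pickle module. The helper names save_vars and load_vars are hypothetical, and this is not a built-in Jupyter feature: it only covers variables that pickle can serialize, and trained models or GPU tensors are usually better saved with the framework's own checkpointing functions.

```python
import pickle


def save_vars(path, namespace, names):
    """Pickle the named variables from `namespace` into one file.

    Returns the names that were skipped, either because they are missing
    from the namespace or because pickle cannot serialize them.
    """
    blobs, skipped = {}, []
    for name in names:
        try:
            # Serialize each variable separately so one bad object
            # does not prevent the others from being saved.
            blobs[name] = pickle.dumps(namespace[name])
        except Exception:
            skipped.append(name)
    with open(path, "wb") as f:
        pickle.dump(blobs, f)
    return skipped


def load_vars(path, namespace):
    """Restore the saved variables into `namespace`; return their names, sorted."""
    with open(path, "rb") as f:
        blobs = pickle.load(f)
    for name, blob in blobs.items():
        namespace[name] = pickle.loads(blob)
    return sorted(blobs)
```

At the end of a session this might be called as save_vars('state.pkl', globals(), ['df', 'results']), and at the start of the next one as load_vars('state.pkl', globals()); the file name and variable names here are placeholders. Pickle files should only be loaded if you created them yourself, since unpickling can run arbitrary code.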

I have created a short survey of Jupyter usage. Would be helpful if everyone reading this filled it out.

I will share the results with the community here. I also plan to compile all the suggestions brought up in this thread.

crossvalidator · September 21, 2019, 1:13pm

Here is the compilation of posts in this thread. I will add more to it if I come across more problems posted by users.

dhoa · February 24, 2020, 1:16pm

Hi,
Is there a way to display a list of images as a kind of video in a Jupyter notebook? Something similar to OpenCV's cv2.imshow, with the image updated every few milliseconds? When working on a remote machine, I can’t use OpenCV to show video.
Thanks