Live coding 3

jeremy · June 1, 2022, 6:44am

This topic is for discussion of the 3rd live coding session.

<<< session 2 ｜ session 4 >>>

Recording

Links from the walk thru

Paperspace Gradient Notebooks

What was covered

The $PATH environment variable
Creating and using a conda environment
How conda environments are actually implemented
Creating a Paperspace notebook
Installing pip packages into your home directory
Symlinks
Persistent storage and mounted drives
Opening JupyterLab in Paperspace
Copying SSH keys to Paperspace

Video timeline - thank you @Daniel

00:00 - Catch-up Questions from the last session

Why does Jeremy prefer to have just one base environment?
What are the advantages of having just one base environment, deleting it and creating a new, updated one when needed?
When does Jeremy occasionally need to create a separate environment?
Why is this way more user-friendly?
Should you install fastai or simply fastbook?

06:11 - settings.ini and fastbook setup (more advanced)

What’s installed when installing fastbook?
Why was sentencepiece a problem? Because it was not in fastchan

08:19 - PATH variable

09:09 - Why does python come from that specific directory?

What is a path or environment variable?
How to print out the content of the PATH variable? echo $PATH
Is this a colon-separated string?
What is this long string made of? The addresses of many directories containing executables, including the python we are using
Why is the PATH variable so important? As long as the python executable is stored in one of the directories listed in the PATH variable, we can just type python to use it
How to run which python without typing python when the last command was python? which !!
What’s inside the mambaforge directory? 11:12
Is the mambaforge directory very similar to the mac/linux root directory?
How do libraries, .bashrc (or .bash_profile) and the PATH variable work together? 11:54
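
As a rough illustration of the PATH lookup described above, here is a minimal Python sketch of what the shell does when you type a command name. The helper name find_on_path is made up for this example; the standard library's shutil.which does the same job and is shown for comparison.

```python
import os
import shutil

def find_on_path(command, path=None):
    """Return the first executable file named `command` in the PATH directories."""
    if path is None:
        path = os.environ.get("PATH", "")
    # PATH is one string; os.pathsep is ":" on Linux/macOS
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate  # first match wins, later directories are ignored
    return None

# Print the directories in search order, then resolve "python" both ways
for directory in os.environ.get("PATH", "").split(os.pathsep):
    print(directory)
print(find_on_path("python"), shutil.which("python"))
```

Because the first match wins, activating an environment only needs to put that environment's bin directory at the front of PATH.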

12:22 - Create a new environment

How to create a different environment?
How to create an env named tmp with python<3.10 and fastcore inside? mamba create -n tmp 'python<3.10' fastcore
How to activate/deactivate an env? mamba activate tmp or mamba deactivate
Where does the python come from after you enter this new tmp environment and run which python?
What is inside the newly generated folder envs/tmp of the mambaforge directory?
Does this python come from the executables stored inside envs/tmp/bin?
Does envs/tmp/lib contain all the libraries of this new environment?
What are hard links? Why are they very nifty? And symlinks later 15:30
How to quickly move directly from tmp env to base env? conda activate
Why does Jeremy recommend not freezing particular versions of libraries?
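
To make the hard link versus symlink distinction concrete, here is a small Python sketch that creates both in a scratch directory. It is not how conda is implemented, only a demonstration of the two kinds of link; the file names are made up for the example.

```python
import os
import tempfile

def link_demo():
    """Compare a hard link and a symlink to the same file in a scratch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")
        hard = os.path.join(tmp, "hard.txt")
        soft = os.path.join(tmp, "soft.txt")
        with open(original, "w") as f:
            f.write("hello")

        os.link(original, hard)      # a second name for the same data on disk
        os.symlink(original, soft)   # a small file that only stores a path

        result = {
            "same_inode": os.stat(original).st_ino == os.stat(hard).st_ino,
            "link_count": os.stat(original).st_nlink,
            "soft_is_symlink": os.path.islink(soft),
        }

        os.remove(original)          # delete the original name
        with open(hard) as f:
            # the data survives because the hard link still refers to it
            result["hard_still_readable"] = f.read() == "hello"
        # the symlink now points at a path that no longer exists
        result["soft_is_broken"] = not os.path.exists(soft)
        return result

print(link_demo())
```

A hard link takes no extra space for the file contents, which is why the same package file can appear in several environments without being copied.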

18:27 - Paperspace gradient notebook

Why is it nice to have things set up locally?
Can we actually do a lot of fastai things without a GPU? Make predictions, play with small datasets, and others
Why Paperspace when we need a GPU? What are Paperspace Gradient notebooks? 19:46
Why is the pro account a good deal?
Why is persistent storage important? Why is each Gradient notebook like a separate server or new computer?

21:19 - How to create a notebook from scratch?

22:01

How to pick the GPU for your notebook?
Why select auto-shutdown?
How to share your notebook?
What is the workspace URL? How to fork the fastbook repo through this Gradient notebook? #question
Why keep the Paperspace Gradient interface page open? Easy GPU shutdown
Why copy and paste another Paperspace Gradient page link?
How to switch to the Jupyter notebook interface for the Gradient notebook? 25:52
How to get used to JupyterLab? 28:39
How to maximize the notebook window?
How to switch between tabs? ctrl + shift + [ or ]
How to turn on and off the side bar? ctrl + b
How to remove the status bar? 29:47

33:12 - The python debugger

How to use the notebook debugger?
How to use the non-graphical debugger? 33:12 Add %%debug at the top of the cell you want to debug
How to get help with ipdb? Just type and run h
Why not use the notebook debugger and the non-graphical debugger together?
Why and how to use %%debug with a function? %%debug; f(); 36:44
How to run the next line of code with the debugger? n
How to print the content of i? p i
How to repeat the last command? Just press enter
How to see the surrounding code and where the execution line is? l
How to find out which function called the current line? w
Why is the non-graphical debugger great, according to Radek?
How to use pdb and set_trace to debug at a specific place in a function? 39:10
How to exit the debugger? q
Why is the Python debugger robust and a good time investment?

43:08 - Installing pip packages into your home directory

If you pip/conda install in the terminal, will the packages still be there the next time you open this notebook?
What is persistent storage and where is it? From the root directory: ls /storage
How to check the storage space Paperspace provides you? df -h
What is the datasets storage? 45:01
Where do we want to install libraries to? The storage folder
Which kinds of libraries are safe to install with pip? Python libraries, even ones that use the GPU, but not pytorch or cuda
Do we have access to the root directory of Gradient notebooks?
How to update libraries into your home directory? pip install -U --user fastcore (--user installs under ~/.local)
Is the root directory access problem solved by pip install --user?
Is this the reason why Jeremy prefers pip to mamba and conda? Yes
How to check the version of a library like fastcore? import fastcore; fastcore.__version__
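
As a quick way to check where pip install --user puts packages for the interpreter you are actually running, the following Python sketch prints the user install locations from the standard site module. The helper installed_version is a name made up for this example, and importlib.metadata assumes Python 3.8 or later (the notebook's python3.7 would need the fastcore.__version__ check from the notes instead).

```python
import site
import sys
from importlib import metadata

def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

print("interpreter:       ", sys.executable)
print("user base:         ", site.getuserbase())          # typically ~/.local on Linux
print("user site-packages:", site.getusersitepackages())  # where --user installs land
print("fastcore version:  ", installed_version("fastcore"))
```

The user site-packages path contains the Python version, so each Python version gets its own --user directory.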

49:21 - Persistent storage, mounted drives, and symlinks

Where does pip install --user store libraries? 49:21 ls -la to see the .local folder, and check .local/lib/python3.7/site-packages/
How to make sure .local is still there when we use the notebook next time? 50:02
How to keep .local on persistent storage without taking up extra space? A symlink
How to create a symlink from .local to /storage? After moving .local into /storage, run ln -s /storage/.local from the home directory
How do you know the symlink was created successfully? Check .local again to see .local -> /storage/.local
How does Jeremy make sure that every time he opens a notebook everything is already set up for him? 52:13
Which file is run to set everything up for us when we start a notebook? /storage/.bash.local
What’s the content of Jeremy’s /storage/.bash.local? 53:05
How do we create our own /storage/.bash.local? Or do we just use Jeremy’s? #question
Why don’t we save stuff in Paperspace’s root filesystem, and use symlinks instead? The root filesystem is not persistent; things will be gone the next time you start the notebook
Can you even symlink SSH keys and Kaggle keys?
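
The move-then-symlink pattern above can be sketched in Python as follows. This is only an illustration under assumptions: the function name persist_dir and its arguments are made up, and the walkthrough itself does this with mv and ln -s in the shell. The function is written to be safe to run on every start, since a restarted machine may come up with a fresh, empty directory where the symlink used to be.

```python
import os
import shutil

def persist_dir(home, storage, name):
    """Keep home/name on persistent storage and leave a symlink in its place."""
    src = os.path.join(home, name)      # e.g. /root/.local
    dst = os.path.join(storage, name)   # e.g. /storage/.local
    if os.path.islink(src):
        return src                      # already linked, nothing to do
    if os.path.isdir(src):
        if os.path.exists(dst):
            shutil.rmtree(src)          # storage already has a copy: drop the fresh one
        else:
            shutil.move(src, dst)       # first run: move the real directory to storage
    else:
        os.makedirs(dst, exist_ok=True) # nothing in home yet: make sure the target exists
    os.symlink(dst, src)                # home/name now points into storage
    return src
```

Calling persist_dir("/root", "/storage", ".local") would correspond to the commands in the notes; try it on throwaway directories first, because it deletes home/name when a stored copy already exists.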

56:27 - Paperspace has different python environments by default

How does Jeremy solve the fastcore problem in real time? 56:27
How does python find modules or the path for libraries? 1:00:30
Where are the places python will search for libraries? import sys; sys.path
How to find fastcore inside a jupyter notebook (still not working inside python)? import fastcore; fastcore?
Does import sys; sys.path give us the same search paths in python vs ipython? No
Finally, what is the problem? fastcore is installed in the python3.7 environment, not the python3.9 one
How to use python3.7? python3.7
But why is ipython running python3.7?
How safe is it to upload SSH keys to Paperspace? 1:08:15
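
When an import works in the notebook but not in the terminal (or the other way round), it can help to run the same small diagnostic in both places and compare the output. The sketch below is one way to do that; the function names describe_interpreter and module_location are made up for this example.

```python
import importlib.util
import sys

def describe_interpreter():
    """Summarise which Python is running and where it looks for modules."""
    return {
        "executable": sys.executable,
        "version": "{}.{}".format(*sys.version_info[:2]),
        "search_path": list(sys.path),
    }

def module_location(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin

info = describe_interpreter()
print(info["executable"], info["version"])
print(*info["search_path"], sep="\n")
print("fastcore found at:", module_location("fastcore"))
```

If the two environments report different executables or versions, a package installed for one of them will not be visible to the other.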

1:09:34 - Creating a Paperspace notebook with everything set up automatically

How to create a server from scratch? 1:09:34
When Jeremy creates a new notebook, is the git directory available automatically? Yes, but you need to start a terminal for it to appear 1:12:13 ls; ls /notebooks/
Why can we only see the git directory after opening a terminal? Maybe because Jeremy’s setup is done by /storage/.bash.local, which is executed in the terminal
Which file does Paperspace run to set everything up for us? run.sh, which has a pre-run.sh step that we should maybe use instead of /storage/.bash.local
How to read the end of a file in the terminal? tail /run.sh
How to turn /storage/.bash.local into the pre-run.sh file? mv /storage/.bash.local /storage/pre-run.sh
Why is Paperspace what Jeremy wants to use? Paperspace listened and made it work the way Jeremy wanted
How to create a new notebook from scratch step by step? 1:15:01 watch this carefully
How should we use the workspace URL and what is it for? If you give it your GitHub repo URL, the contents will be ready in the /notebooks/ directory when you open the new notebook

1:16:35 - Copying SSH keys to Paperspace to communicate with github

How to set up SSH on Paperspace from scratch? 1:16:35
Why do we create SSH keys for the notebook? How to create an SSH key in the Paperspace terminal?
How to interpret the files and directories created by ssh-keygen? Nicely explained
How to upload your local private and public keys to the /notebooks/git directory?
How to show all the files starting with id_rsa? mv /notebooks/git/id_rsa + tab to show them all
How to move them all to the current directory? mv /notebooks/git/id_rsa* ./
How to change the permissions to make sure no one else can read the private key? chmod og-r id_rsa 1:20:26

How to use the SSH key to ask GitHub to connect with this notebook? ssh git@github.com 1:21:01

mike.moloch · June 1, 2022, 12:02pm

I really appreciated the discussion on how to use the ‘permanent’ storage that paperspace gives me and how to attach it to any new notebook server that I start.

It was also good to be reminded that even though Paperspace calls them “notebooks”, these are actually full-blown virtual servers. In fact, I can recognize that these are Docker containers based on the image they publish on their Docker Hub repo.

Thank you Jeremy for taking the time to do these walkthroughs!

kurianbenoy · June 2, 2022, 2:03am

I was curious about the pricing of permanent storage in Paperspace, and the concept of permanent storage is really interesting. Let me search for it.

EDIT: Storage and datasets | Paperspace

Pricing: https://support.paperspace.com/hc/en-us/articles/360003804333-Storage-Pricing#h_468797348141531099897534

miwojc · June 2, 2022, 4:11am

I wonder if we could also cover at some point how to set up a home GPU server to be accessible from outside the local (home) network. I can connect to my GPU server when at home, but it would also be great to have access to it from outside…

jeremy · June 2, 2022, 6:21am

Good idea.

Daniel · June 3, 2022, 4:31am

This is a very rough but detailed set of notes; hopefully it will be of some use when searching for information in the video.


jeremy · June 3, 2022, 6:25am

Very helpful, thank you

nikem · June 4, 2022, 3:53pm

In the part where we move .local to the persistent storage, we then create a symlink that works fine, but I believe the symlink itself is not persistent either. In my case, I stopped the machine and restarted it; the symlink was gone, and there was a brand new .local directory in its place.

[image: sym1]

After restart:
[image: sym2]

Maybe the symlink needs to be created in .bash.local under persistent storage. Am I right?

jeremy · June 4, 2022, 8:17pm

Yes, in .bash.local you need a line to create the symlink.

nikem · June 4, 2022, 10:37pm

I just re-watched the next session (Walk-thru 4) and the solution there (pre-run.sh) is better for me considering my goal which is keeping all notebooks/instances consistent.
Thanks, Jeremy

mdmanurung · June 7, 2022, 3:16pm

Hi all,

Catching up with the walk-thru videos now. Regarding the ssh key, would it be better to upload our key into the /storage folder and then create a symlink in .bash.local or pre-run.sh to link it to the notebook’s .ssh folder?

jeremy · June 7, 2022, 9:43pm

I suggest symlinking the whole .ssh folder like we do in the walkthru.

zymoide1 · June 8, 2022, 6:17pm

I think I might have messed up something pretty severely in Paperspace’s terminal. Following this lecture, I moved the pre-run.sh script (right about here in lecture 3), and now whenever I try to create a notebook and choose a fast.ai runtime, I get an error while it is trying to set up an image, so I can’t access anything using this option.

In the meantime, can I fire up a server using the first option (PyTorch) and follow the instructions of the 4th lecture?

zymoide1 · June 8, 2022, 6:55pm

Update: I created a new server using the PyTorch option and was able to delete the same pre-run.sh file (that probably caused the issue… not sure what I did there that completely broke the system, but it’s working now.)

mike.moloch · June 8, 2022, 7:05pm

This is interesting (and good to know). So essentially, if you mess up the pre-run.sh file, there’s no way to fix it from within a fastai machine as it tries to run it. So using a pytorch base instance lets one edit it as the pytorch instance doesn’t mess with anything.

It might be something Paperspace could add (like a checkbox in the “advanced properties” area which basically means “ignore pre-run.sh” when checked?)

Would’ve been instructive to know what caused the issue though.

zymoide1 · June 8, 2022, 7:46pm

Yeah, that sounds about right. I wish I knew what the error was… when I’m firing up a new instance, I can’t even see the storage folders no more (on the left pane). I think I need to rewatch the 4th lecture for the 3rd time and start from there… maybe even the 3rd walkthru. Is there any written tutorial anyone had set up I can follow so I’ll know I haven’t messed anything up? Thanks Mike.

jeremy · June 8, 2022, 10:20pm

If you join the call today in 100 mins time you could share your screen and we could debug together, if you like.

zymoide1 · June 8, 2022, 11:20pm

I’m now (finally) starting the 6th walkthru after watching the 3rd, 4th, and 5th 2-3 times each, and I’m glad to report that everything is working properly thus far. I hope to join tomorrow and finally be able to be live with the rest of the class. Thank you!!!

bencoman · June 11, 2022, 3:26am

At 28:45 it’s hard to see what was clicked on to make it fullscreen, since it is hidden behind the attendees.

Managed to work it out, so here it is for others…

Exit full screen using F11, or hovering mouse at top edge of screen.

bencoman · June 11, 2022, 4:31am

Some notes, so I don’t need to search for it in the video to remember.

set_trace

At 39:00…

from pdb import set_trace
def f():
    for i in range(1):
        set_trace()
        print(i)

f()

Starting a paperspace terminal

On first runthrough, I missed the point that Jeremy launched the terminal.
I had to hunt back to find it at 29:10, where the tabs went from…

to…

by clicking “Terminal” in the “Launcher tab”…

Symlinking paperspace .local directory for use with pip install --user

At [47:30], when you don’t have root access or the root filesystem is not persistent,
use pip install --user to install into the home directory, with ~/.local/ symlinked into the persistent /storage directory.

$ cd && pwd
/root
$ mv  .local  /storage/
$ ln -s  /storage/.local
$ ls -ld  .local
lrwxrwxrwx 1 root root 16 Jun 11 04:14 /root/.local -> /storage/.local/

Customise paperspace with pre-run.sh

This is a copy of Jeremy’s file (originally file /storage/.bash.local discussed at [52:32] and later converted to file /storage/pre-run.sh at [1:11:42]).

NOTE: This is for template reference only. Don’t use blindly. Each dot-folder probably needs to be independently understood and set up.

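#!/usr/bin/env bash

# If ~/.config does not exist yet, replace the default dot-folders in the
# home directory with symlinks to the persistent copies in /storage/cfg.
if [ ! -e ~/.config ]; then
    cd
    rm -rf .config .fastai .jupyter .local .ssh .bash_history
    for p in .local .ssh .config .ipython .fastai .jupyter .git-credentials .gitconfig .bash_history conda .kaggle
    do
        ln -s /storage/cfg/$p
    done
fi
# Keep the torch and huggingface download caches on persistent storage.
if [ ! -e /notebooks/git ]; then
    ln -s /storage/cfg/torch ~/.cache/
    ln -s /storage/cfg/huggingface ~/.cache/
fi
# Make the persistent git directory visible under /notebooks.
if [ ! -e /notebooks/git ]; then
    ln -s /storage/git  /notebooks/
fi

# Put user-installed scripts and the conda install on the PATH.
export PATH=~/.local/bin:~/conda/bin:$PATH
export MAMBA_ROOT_PREFIX=~/conda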

Debugging difference between notebook and command-line imports

Discussed at [57:30].
Note: the notebook kernel may need to be restarted to pick up “.local” in the search path.

! pip install -U --user fastcore
import sys
print(*sys.path, sep="\n")
import fastcore
fastcore?

Adding SSH keys to paperspace

Discussed at [1:15:36].
After uploading id_rsa and id_rsa.pub into root notebooks folder,
which in the terminal is directory /notebooks

$ cd
$ rm -rf .ssh   #remove any existing .ssh folder or symlink
$ ssh-keygen   #to generate sample keys to see required permissions
$ cd .ssh
$ ls -la   #review permissions
$ mv /notebooks/id_rsa* .
$ ls -la   #see that permissions are wrong 
$ chmod 600 id_rsa
$ ls -la    #confirm permissions corrected
$ ssh -T git@github.com

P.S. Jeremy asks: “Why is it chmod, not chperm?”
It’s from “change file mode bits”.
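
For anyone curious what those mode bits look like from Python, here is a minimal sketch of the same permission fix as chmod 600, using the standard os and stat modules. The function name restrict_to_owner is made up for this example.

```python
import os
import stat

def restrict_to_owner(path):
    """Same effect as `chmod 600 path`: the owner can read and write, nobody else can."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    # Return only the permission bits of the file's mode, e.g. 0o600
    return stat.S_IMODE(os.stat(path).st_mode)
```

For example, oct(restrict_to_owner("id_rsa")) should give '0o600' when run in the directory holding the key; ssh refuses to use a private key that other users can read.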