Live coding 4

This topic is for discussion of the 4th live coding session.

<<< session 3 | session 5 >>>

Important note

During this session we create a pre-run.sh file which has a bug in it. To fix the bug, you need to add the following line to the end of it:

cd /notebooks

If you don’t add that line, you won’t be able to see your existing notebooks any more! See the start of the next video for details.
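For anyone who still wants the manual setup, here is a minimal sketch of the corrected script logic, ending back in the notebooks directory (written as cd /notebooks later in this thread). The function name link_persistent_dirs and the sandbox paths are made up for illustration. On a real machine the three arguments would presumably be /storage/cfg, the home directory and /notebooks.

```shell
#!/usr/bin/env bash

# Sketch only: the same steps as pre-run.sh, wrapped in a function so the
# directories can be swapped for a throwaway sandbox.
link_persistent_dirs() {
  local cfg_dir="$1" home_dir="$2" final_dir="$3" name
  cd "$home_dir" || return 1
  for name in .local .ssh; do
    rm -rf "$name"            # drop whatever the fresh image shipped with
    ln -s "$cfg_dir/$name"    # link to the persistent copy
  done
  cd "$final_dir" || return 1 # the fix: finish in the notebooks directory
}

# Hypothetical sandbox standing in for /storage/cfg, the home dir and /notebooks.
demo=$(mktemp -d)
mkdir -p "$demo/storage/cfg/.local" "$demo/storage/cfg/.ssh" "$demo/home" "$demo/notebooks"

link_persistent_dirs "$demo/storage/cfg" "$demo/home" "$demo/notebooks"
ls -la "$demo/home"   # shows .local and .ssh as symlinks
pwd                   # ends in .../notebooks
```

Looping over the directory names keeps the script short if more folders need to persist later.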

More importantly: since we recorded this, we’ve now automated the setup process, so you can skip nearly everything in this session! To set up your paperspace environment, follow the instructions here:

What was covered

  • Setting up a paperspace server from scratch
  • Paperspace persistent storage details
  • pip vs conda/mamba
  • Creating a new bash script
  • #! script headers
  • chmod permissions / octal masks
  • Uploading and testing existing ssh keys
  • How pre-run.sh works

Video timeline - thank you @Daniel

00:00 - Create a totally empty notebook

  • How to be lazy and a great programmer? 03:02

04:13 - Create an empty notebook and symlink from persistent storage

  • Why open JupyterLab in a new window? To keep the Paperspace interface available for shutting down the machine when finished
  • Why should you read the Paperspace docs? What did Jeremy find out? 05:16
  • What are the interesting folders inside the root directory?
  • How is the /storage folder different from the /notebooks directory? Why do both exist, for good reason? 06:32
  • Should we worry about using pip install when Paperspace uses conda a lot? 08:02
  • How to pip upgrade packages into the home directory with --user? pip install -U --user fastcore 13:14
  • Which folder do we want to still be there the next time we open a notebook? .local/
  • How to save this .local/ into persistent storage? mv .local /storage/
  • How to delete everything to the left of the cursor? ctrl + u
  • How to delete everything to the right of the cursor? ctrl + k
  • It is better to save .local to /storage/cfg; how to do that the lazy way? mkdir /storage/cfg; mv .local !$
  • How to create a symlink from persistent storage into this notebook’s home directory? ln -s /storage/cfg/.local/ 16:04
  • How to check whether the symlink was created? ls -la
  • When creating a new notebook, don’t forget to choose the fastai image and use the advanced options to remove the git repo for a clean notebook 18:40
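The move-then-symlink steps in the list above can be tried safely in a scratch directory first. This is only a sketch: the home and storage directories below are hypothetical stand-ins created with mktemp, not the real home directory or /storage on Paperspace.

```shell
#!/usr/bin/env bash

# Hypothetical sandbox:
#   $demo/home    plays the role of the home directory
#   $demo/storage plays the role of /storage
demo=$(mktemp -d)
mkdir -p "$demo/home/.local/lib" "$demo/storage"
touch "$demo/home/.local/lib/example-package"

# Move the pip --user directory into persistent storage ...
mkdir -p "$demo/storage/cfg"
mv "$demo/home/.local" "$demo/storage/cfg/"

# ... and leave a symlink behind so it is still found at the old location.
# With no link name given, ln -s names the link after the target (.local).
cd "$demo/home"
ln -s "$demo/storage/cfg/.local"

# ls -la shows the link and where it points.
ls -la "$demo/home"
```

After this, anything written under the linked .local actually lands in the storage directory, which is the point of the trick.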

19:24 - Create pre-run.sh from scratch to automate the .local symlink from storage

  • How to create a script that sets up the symlink before JupyterLab starts, so that packages installed with pip install -U --user automatically end up in /storage/cfg/.local? 19:24
  • Does Jeremy think Paperspace is the way to get an easy-to-use GPU for fastai in the cloud? 21:32 yes
  • How to make a symlink to /storage/ inside this notebook’s /notebooks directory? cd /notebooks/; ln -s /storage/
  • What does the step above do? It makes the /storage/ folder reachable from inside the /notebooks/ folder
  • Why is it useful to access /storage/ from inside /notebooks? So that we can create and edit files in /storage/ with JupyterLab
  • How to create a text file, edit it and save it inside /storage/ with JupyterLab? 24:53
  • What shall we put inside this text file?
  • #!/usr/bin/env bash: run the following script with bash
  • cd: go to the home directory
  • ln -s /storage/cfg/.local/: symlink the .local folder in persistent storage into the home directory
  • How to find where bash is? which bash may give us /bin/bash
  • What should we name this text file? pre-run.sh
  • Why can’t we run ./pre-run.sh directly? 27:57 Execute permission is needed; ls -la shows the current permissions
  • What is the usual way of setting execute permission? chmod u+x pre-run.sh
  • How did Jeremy set the permissions? chmod 744 pre-run.sh. What do 7, 4, 4 each mean? 29:47
  • How to check the permissions again and rerun the script? ls -la; ./pre-run.sh
  • How to ensure there won’t be a leftover .local folder when a notebook opens next time? Add rm -rf .local to pre-run.sh, so the file now looks like this:

    #!/usr/bin/env bash

    cd

    rm -rf .local

    ln -s /storage/cfg/.local

  • Let’s create a new notebook to see whether .local is linked from storage 32:28
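On the chmod 744 question in the list above: each octal digit is a sum of read (4), write (2) and execute (1), given for the owner, the group and everyone else in that order. A small sketch on a scratch file (the path is a hypothetical stand-in for the real pre-run.sh):

```shell
#!/usr/bin/env bash

# Hypothetical scratch file standing in for pre-run.sh.
demo=$(mktemp -d)
script="$demo/pre-run.sh"
printf '#!/usr/bin/env bash\necho "pre-run ran"\n' > "$script"

# 7 = 4+2+1 -> rwx for the owner
# 4 = 4     -> r-- for the group
# 4 = 4     -> r-- for everyone else
chmod 744 "$script"

# The first ten characters of ls -l are the file type plus the permission bits.
perms=$(ls -l "$script" | cut -c1-10)
echo "$perms"
```

chmod u+x only adds the owner’s execute bit to whatever was already there, while 744 sets all nine bits at once, which is why the two can give different results on a file with unusual starting permissions.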

33:15 - Set up SSH keys from scratch and automate the .ssh symlink from storage with pre-run.sh

  • First, how to create a .ssh folder to hold the keys? 33:15 mkdir .ssh
  • How to upload the keys from the local computer to the notebook with the GUI? They are uploaded into /storage/
  • How to move them into the .ssh folder? cd .ssh; mv /storage/id_rsa* ./
  • How to check whether the SSH keys have the proper permissions? ls -la
  • How to give only the user full rights to the .ssh directory? 34:24 chmod 700 .
  • How to let only the user read and write the private key? chmod 600 id_rsa
  • How to let the user read and write the public key and everyone else only read it? chmod 644 id_rsa.pub
  • How to test the connection to GitHub using the SSH key? Inside the .ssh directory, run ssh git@github.com and type yes to continue; the connection should then succeed
  • How to see the connection process in detail? ssh -vvv git@github.com
  • How to symlink the .ssh folder from /storage/cfg/.ssh? 36:44
  • First, mv .ssh /storage/cfg/
  • Then update pre-run.sh as follows:

    #!/usr/bin/env bash

    cd

    rm -rf .local

    ln -s /storage/cfg/.local

    rm -rf .ssh

    ln -s /storage/cfg/.ssh

  • Run ./pre-run.sh and check the symlinks with ls -la
  • Check SSH and the connection with ssh git@github.com
  • Great, pre-run.sh works on newly created notebooks 38:02
  • Why does the Paperspace startup script (run.sh) run pre-run.sh? 40:03 Because they listen to Jeremy
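The three chmod calls from the SSH steps above can be checked on placeholder files before touching real keys. A sketch, assuming a scratch directory in place of the real ~/.ssh; the file contents are dummies, not keys:

```shell
#!/usr/bin/env bash

# Scratch directory standing in for the home directory (hypothetical).
demo=$(mktemp -d)
mkdir "$demo/.ssh"
echo "placeholder private key" > "$demo/.ssh/id_rsa"
echo "placeholder public key"  > "$demo/.ssh/id_rsa.pub"

chmod 700 "$demo/.ssh"            # owner only: read, write, enter
chmod 600 "$demo/.ssh/id_rsa"     # owner only: read, write
chmod 644 "$demo/.ssh/id_rsa.pub" # owner read/write, everyone else read

# First ten characters of ls -l: file type plus permission bits.
dir_perms=$(ls -ld "$demo/.ssh" | cut -c1-10)
key_perms=$(ls -l "$demo/.ssh/id_rsa" | cut -c1-10)
pub_perms=$(ls -l "$demo/.ssh/id_rsa.pub" | cut -c1-10)
printf '%s  .ssh\n%s  id_rsa\n%s  id_rsa.pub\n' "$dir_perms" "$key_perms" "$pub_perms"
```

The private key is the one that matters most here: ssh normally refuses to use a private key file that other users can read.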

I am not sure if anyone has already written this down somewhere on the forums. If not, here it is:

!! - execute the last command
!<something> - execute the most recent command in history (searching backward from the present) that starts with <something>

For example:

$ echo 'hi'
hi
$ ls -a
.  ..  .bashrc  .vimrc
$ !e
echo 'hi'
hi

Now the !$ is a bit tricky. I think it refers to the last argument of the most recent command.

Not sure if I got it right :thinking: No chance to google these ones though :smile: Watching the walk-thru is our only recourse!


Jeremy, I am not sure about the validity of what you do with your SSH keys. The concept of root is a little confusing here. On an individual’s computer it is pretty straightforward: the individual has root permissions. But on a remote server, who actually is root? I believe the people maintaining that server environment have root privileges on all the servers in it. I flag this here as I am not a professional system administrator. I will try to look into it further, but the advice of a SysAdmin person at U may be prudent.

It’s Elizabeth II’s 70th Jubilee celebrations here, so I am honor bound to have a few drinks and party.

Hi RogerS49, can you please reference the video timestamp for the SSH keys part that you mentioned?

Also, I’m here in Canada raising a cup of Chai Tea to the Queen’s health on her Platinum Jubilee :smiley:

I got things working as planned and as taught in the lesson. But I’ve noticed that since creating the pre-run.sh file (which is persistent across servers), my other servers are a bit screwed up.

I had previously created two other Paperspace servers using git clone of the fastai repo, one directly from the fastai repo and one from a fork on my own GitHub account. Those notebooks are now not showing up as files in JupyterLab on any server. For example, here is what the Gradient server looks like prior to starting the virtual machine:

The nbs are all listed.

Once I start the machine, the list of files in Jupyter is empty:

Empty Jupyter Directory

The nbs are there in /notebooks in the terminal:

I recall that Paperspace opens Jupyter in the /notebooks directory by default. If I enter !pwd, I can see that I am in /root. I found this way of using cell magic to change the working directory:

But the notebooks still don’t show up in Jupyter:

No files in Jupyter

Questions:

  1. Why did the Jupyter root directory change?
  2. How can I access the notebooks in Jupyter?
  3. Can I change the default directory back to notebooks?

P.S. I entertained the idea that it had something to do with the symlink to /storage that we created in /notebooks last night, but as Jeremy said (and I have confirmed), the /notebooks directory does not persist across servers.


Thanks for pointing this out. I think it can be useful when you work out the right argument with a non-destructive command and then want to pass it to a ‘destructive’ one.

As in the example below, I’m using ‘ls’ to see if the file removeme.txt is there, then I just take that argument to ls command and pass it to rm command to get rid of it. The last ‘ls’ checks to see if it’s gone by using the argument to the last command which happens to be the ‘rm’ command.

But honestly though, I try to stay away from these fancy globbing things when doing anything like rm because if I happen to forget, the results can be catastrophic.

(fastai2022) mikemoloch@moloch titanic % touch removeme.txt
(fastai2022) mikemoloch@moloch titanic % ls removeme*
removeme.txt
(fastai2022) mikemoloch@moloch titanic % rm !$
rm removeme*
(fastai2022) mikemoloch@moloch titanic % ls !$
ls removeme*
zsh: no matches found: removeme*

Try 33:16


Added:

I restored access by making a symlink to /notebooks in the home directory. I suppose I could add this to pre-run.sh so that it is automatic at boot. Is this the best way? I would still like to know why this happened.

This is the stuff that drives me crazy and consumes a lot of time!

Another question from last night:

If we delete the mambaforge directory for a fresh install, as Jeremy suggests, will this also remove the pip-installed modules, which are stored in /storage/.local? I would expect we’d have to delete those as well.

You can also find it in walkthru 3
1:15:35 about SSH keys
1:17:25 about SSH keygen


Oops - that’s because we put “cd” in our script!

To fix, put “cd /notebooks” at the end of the script.
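A possible way to see why a bare cd in the script matters: if the startup script sources pre-run.sh (an assumption on my part, but it would explain the symptoms), any cd inside it also moves the shell that later launches Jupyter. The sketch below uses made-up stand-in files buggy.sh and fixed.sh in a scratch directory, not the real pre-run.sh:

```shell
#!/usr/bin/env bash

# Hypothetical sandbox with stand-ins for the home and notebooks directories.
demo=$(mktemp -d)
mkdir -p "$demo/home" "$demo/notebooks"

# Stand-in for the buggy script: changes directory and never comes back.
printf 'cd "%s/home"\n' "$demo" > "$demo/buggy.sh"
# Stand-in for the fixed script: same cd, then back to the notebooks directory.
printf 'cd "%s/home"\ncd "%s/notebooks"\n' "$demo" "$demo" > "$demo/fixed.sh"

# Sourcing (.) runs the script in the current shell, so its cd sticks.
cd "$demo/notebooks"
. "$demo/buggy.sh"
after_buggy=$(basename "$PWD")
echo "after buggy script: $after_buggy"

cd "$demo/notebooks"
. "$demo/fixed.sh"
after_fixed=$(basename "$PWD")
echo "after fixed script: $after_fixed"
```

If the script were run as a separate process instead of being sourced, its cd would not leak out like this, so the empty file browser suggests it runs in the same shell.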


Sorry, I don’t understand your question, or what you’re not sure about here.

Exactly right!


I am not sure what I am saying myself, as the field of security is so complex, or even whether there is a problem. But my gut feeling is that storing a private key on a cloud server is a security risk. On your client box you are the admin of the whole system. On a cloud server you are admin of your slice of the cloud, but there are admins at a level above you who have greater privileges than you do on that server. Anyway, I am still not sure what I am saying here. I did this search:

Search: storing a private key on a cloud server

I would have thought that accessing the GitHub server for git clone etc. would mean repeating what you did on your client box to access the git server directly, but I think that is impracticable: each time you log in to the cloud you would never be on the same server as last time, as the physical address of the sender/cloud client would be different. I am sorry, this gets complicated to explain if you’re not a security expert, which I am not. Anyway, I just feel there may be more secure ways to do remote calls from a cloud server. I appreciate this is off topic from the main gist of these walk-throughs; perhaps it should be in a separate discussion, or not, if this is just pie in the sky.

We discussed that a bit in the walk-thru. There are 3 levels of security you can choose from:

  • Use a separate GitHub Personal Access Token (PAT) for the machine. This is most secure, since you can define the exact permissions it has
  • Use a separate SSH key pair for the machine. This is less secure, since it has access to your whole GitHub account
  • Use a single SSH key pair across all your machines. This is the least secure, since it has access to your GitHub account and any machines you have that are accessible over SSH.

Hopefully that helps explain your options a bit better.


Here is a rough but detailed note which may help search in the video. It is reproduced as the video timeline in the first post.

New test set to practise newly learned vim skills :pray:

Hello all, there is a note on the gradient docs saying that the free tier notebooks are public:

https://docs.paperspace.com/gradient/machines/

Public:

  • On free subscriptions, notebooks with free machines will always be set to public. Upgrade to Pro/Growth if you would like to set free machines to private.

I do not know exactly what that means, or whether it is a security risk, considering some of us may have put a private key on the server.
Any ideas?

It’s just read access to the notebooks, not access to the terminal.


Ctrl + Z, fg, pushd and popd are super helpful.

I am also learning new vim tricks every day now.

I have seen all the walkthrus now. I am also interested in the process of creating requirements.txt and environment YAML files that are so trimmed down.

Thanks for doing this Jeremy.