Live coding 8

This topic is for discussion of the 8th live coding session

<<< session 7 | session 9 >>>

Links from the walk-thru

What was covered

  • (please contribute here)

Video timeline - thank you @Mattr

05:10 - Running a kaggle notebook on your local computer
09:10 - Setting up to run on your own GPU server
15:40 - Get back to where we left off in walkthru7
16:00 - Get file sizes the slow way
17:00 - Using parallel processing to speed things up
23:00 - Selecting a different image model from timm
29:00 - Start fine tuning model
30:50 - Description of fine_tune
35:50 - Discussion of fit one cycle
48:30 - Applying fine tuned model to test set
49:00 - Reviewing docs for test data loader for inference
52:00 - Preparing file for kaggle submission
1:00:00 - Visually check results and submission file
1:02:10 - Submit entry to kaggle from command line
1:06:00 - Check leaderboard on kaggle site for problems
1:07:00 - Fix order of results file and resubmit
1:10:00 - Questions

9 Likes

OK, so it’s a Dynamic DNS service I need in order to connect to my GPU server from outside my home network. Thank you for this lesson Jeremy!

4 Likes

I had not known about any of the ! commands before, and the whole bringing jobs to the background/foreground thing also seemed a little remote to me even though I technically knew about it. But now that I have started using all this, moving around in bash feels so much nicer :slight_smile:

Dunno what the effect is, as I genuinely did know about pushd for example, but it took being part of the walk-thrus to actually start using all this :thinking: Awesome and a big thanks! :smile:

Inspired by all the installing of stuff and changing the PATH environment variable, (and symlinking!) I wrote this small program for running docker, it is super useful to me

#!/usr/bin/env python3

import argparse
import subprocess

parser = argparse.ArgumentParser()
parser.add_argument('-i', '--image', help='name of the docker image to run', type=str, default='nvcr.io/nvidia/merlin/merlin-pytorch-training:22.05')
parser.add_argument('-c', '--cmd', help='cmd to run', type=str, default='jupyter notebook --ip=0.0.0.0 --allow-root --no-browser --NotebookApp.token=""')
parser.add_argument('-np', '--no-port-forwarding', help='do not forward port 8888', action='store_true')

args = parser.parse_args()

docker_cmd = ['docker run -it --gpus all --ipc=host']
if args.no_port_forwarding is False:
    docker_cmd.append('-p 8888:8888')
docker_cmd.append('--mount src=`pwd`,target=/workspace,type=bind --ulimit memlock=-1 --ulimit stack=67108864 --entrypoint="" -w /workspace')
docker_cmd += [args.image, args.cmd]

subprocess.call(' '.join(docker_cmd), shell=True)

I initially tried to do all this in bash (by defining a function), but this very quickly turned into a painful ordeal (as the logic was becoming more complex). Turns out python is quite handy for such a thing :slight_smile:

I am not sure if that is the “correct” way to use it, but I keep this code somewhere in my homedir where I can easily edit it, have made it executable and have linked to it from /usr/local/bin. Not sure if that is where we should put our own stuff? Probably!

One mystery I continue to be faced with is how come in a docker container I cannot do pip install -e . on a local repo? I mean, the command runs but docker still imports the library that was preinstalled there :man_shrugging:

But one cool thing I stumbled upon is that if I do the following at the top of my notebook:

import os
os.chdir('/workspace')

and workspace is where the code for the library I would like to work on lives, somehow, that code gets imported instead of what is installed.

But… how? Why? So what is pip install -e . doing? Now this will be a stupid question… but for all the python packages in the universe, can I always do this? Be at the root of the repo where they are defined, where I could do pip install -e . and simply import them “directly”? What magic is this?!

If that would indeed work with all Python code, that would be quite neat because… I could then always use pdb.set_trace to step through any Python code in the entire known universe without doing pip install -e ., would just need to change the workdir using os.chdir.
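A likely explanation for the os.chdir behaviour, offered as an assumption rather than a confirmed diagnosis: interactive sessions and notebook kernels usually have '' (meaning "the current directory") at the front of sys.path, ahead of site-packages, so whatever package sits in the working directory wins. The sketch below reproduces that with a throwaway package; demo_lib and import_from_cwd are hypothetical names.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path


def import_from_cwd(package_name, source_dir):
    """Import package_name after changing into source_dir, as a notebook cell would."""
    os.chdir(source_dir)
    # '' on sys.path is resolved to the current working directory at import time.
    # Interactive sessions normally have it already; add it here to mimic them.
    if "" not in sys.path:
        sys.path.insert(0, "")
    importlib.invalidate_caches()
    sys.modules.pop(package_name, None)  # forget any copy imported earlier
    return importlib.import_module(package_name)


# Build a throwaway "repo" containing a hypothetical package called demo_lib.
repo = tempfile.mkdtemp()
pkg_dir = Path(repo) / "demo_lib"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("VERSION = 'local checkout'\n")

module = import_from_cwd("demo_lib", repo)
print(module.VERSION)
print(module.__file__)  # a path inside the temporary repo
```

This only works when the package folder sits directly in the working directory. Repos that keep their code under src/, or that need a build step (compiled extensions, for example), still need a real install.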

Well, figuring out these pieces was fun, now I actually have to figure out what the code I wanted to learn about does :smile: That will be aaaaa long process…

1 Like

This is indeed really cool, I will certainly use it. Two comments:

  • Why not use the -v option instead of mount?
  • I would add the --rm option so the container gets removed when it exits.

1 Like

Good point on the --rm, haven’t thought of that! (and I end up running docker container prune every now and then but that is because when I type in the command by hand I don’t feel like typing the additional --rm, it always slips my mind, but with this living in a file, there is no excuse now :slight_smile: )
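A minimal sketch of how the script might look with --rm added and the command built as an argument list instead of a shell string. The function name build_docker_cmd and its parameters are hypothetical; the flags are the ones from the original script plus --rm, and `pwd` is replaced by os.getcwd().

```python
import os
import shlex


def build_docker_cmd(image, cmd, forward_port=True, remove_on_exit=True):
    """Return the docker invocation as a list of arguments (no shell needed)."""
    args = ["docker", "run", "-it", "--gpus", "all", "--ipc=host"]
    if remove_on_exit:
        args.append("--rm")  # remove the container once it exits
    if forward_port:
        args += ["-p", "8888:8888"]
    args += [
        "--mount", f"type=bind,src={os.getcwd()},target=/workspace",
        "--ulimit", "memlock=-1",
        "--ulimit", "stack=67108864",
        "--entrypoint", "",
        "-w", "/workspace",
        image,
    ]
    # Split the in-container command the way a shell would.
    return args + shlex.split(cmd)


example = build_docker_cmd(
    "nvcr.io/nvidia/merlin/merlin-pytorch-training:22.05",
    "jupyter notebook --ip=0.0.0.0",
)
print(shlex.join(example))
```

The list can be handed to subprocess.call(example) directly, which avoids shell=True and the quoting surprises that come with it.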

I am not sure about the -v vs --mount, do you know how it differs? I tried reading about it in the docs some time ago but my research was inconclusive :slight_smile:

I have always used -v (it is like a shortcut for --mount type=bind), so I don’t know…
I found this on the differences

2 Likes

I’ve always used -v, but looking at the docs you linked, it seems I should be using bind instead. One thing I found is that for my local docker container, I have a notebooks directory which I map to /notebooks using -v, but when I make changes from within the docker container (like creating a new folder or a new .ipynb file), the permissions on those files are root/root instead of those of the userid that I use on the host system.

Some additional steps I followed that were not covered in this walk thru because Jeremy wasn’t working from a Paperspace notebook include:

  • Moving and symlinking the .kaggle folder into storage so I didn’t have to upload the API key each time
  • Restarting the Jupyter kernel after pip installing timm
  • Running the get_data.sh file to automatically download data (covered in walk thru 7)
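The "move into storage and symlink back" step from the first bullet could be sketched as below. This is an illustration under assumptions: move_and_symlink is a hypothetical helper, and temporary directories stand in for the home directory and the persistent storage folder.

```python
import shutil
import tempfile
from pathlib import Path


def move_and_symlink(src, persistent_dir):
    """Move src into persistent_dir and leave a symlink at the old location."""
    src = Path(src)
    dest = Path(persistent_dir) / src.name
    if src.is_symlink():
        return dest  # already linked, nothing to do
    shutil.move(str(src), str(dest))
    src.symlink_to(dest, target_is_directory=dest.is_dir())
    return dest


# Demonstration with temporary directories standing in for ~ and the storage folder.
home = Path(tempfile.mkdtemp())
storage = Path(tempfile.mkdtemp())
(home / ".kaggle").mkdir()
(home / ".kaggle" / "kaggle.json").write_text("{}")

move_and_symlink(home / ".kaggle", storage)
print((home / ".kaggle").is_symlink())
print((home / ".kaggle" / "kaggle.json").read_text())
```

If the home directory is recreated on every start, the symlink itself has to be recreated each time as well; only the moved folder persists.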

Things I would like to cover are:

  • How to make the timm install persistent? We covered creating a persistent conda environment but I’m not sure how to do the same for pip packages. Having to restart the kernel to use it each time is a bit of a pain that might be a trap for others.
  • Also, what are some recommendations for saving and loading models efficiently with Paperspace? After spending a certain amount of energy and time fine tuning a model, where is it best to save the result? Using learn.save('paddy_conv'), the model was saved to Path('train_images/models/paddy_conv.pth'), which doesn’t feel like the best place for it, and my model file is quite large (~576 MB).

Navigation in Jupyter notebooks is another thing I’d like to improve my skills in. It’s nice that there is some overlap with vim navigation keys. Slowly building the muscle memory. Using Ctrl-Delete to merge a cell with the cell above just doesn’t yet feel right to me.

One thing I would love to discover in any IDE is a way to move the cursor from within quotes like this “xxxx|” to outside of the quotes like this “xxxx”| without using the right arrow, which is too far from the home position on the keyboard. It’s a small thing but maybe there is a solution?

2 Likes

Is there a way to make show_batch() display an image from every vocab class?

This should already be persistent due to linking the ~/.local path, but ask me in the session and we can chat about it.

1 Like

Nope sorry! Nice idea though - might be a good exercise to try writing it.
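As a starting point for that exercise, here is a minimal sketch of the selection step only: picking the first item seen for each label. It does not use any fastai API; one_per_class is a hypothetical helper, the file paths are made up, and labelling by parent folder name is an assumption about how the data is laid out.

```python
from pathlib import Path


def one_per_class(items, label_func):
    """Return the first item seen for each label, in order of first appearance."""
    chosen = {}
    for item in items:
        chosen.setdefault(label_func(item), item)
    return list(chosen.values())


# Made-up paths where the parent folder name plays the role of the class label.
files = [
    Path("train_images/class_a/1.jpg"),
    Path("train_images/class_a/2.jpg"),
    Path("train_images/class_b/3.jpg"),
    Path("train_images/class_c/4.jpg"),
    Path("train_images/class_b/5.jpg"),
]

picked = one_per_class(files, lambda p: p.parent.name)
for path in picked:
    print(path)
```

The selected items would still need to be turned into a batch and displayed, which is the part left as the exercise.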

2 Likes

In vim it’s f"l to do that. Ask me in class and I’ll show you.

5 Likes

This question is perhaps related to Matt’s question regarding problems with importing timm. This could also be related to Walkthrough 3 56:27-1:09:34.
I checked and noticed that ipython uses /root/conda/bin/python, Python 3.10 (both timm and fastbook are in /root/.local/lib/python3.10/site-packages/). It looks like /root/conda/bin/python can access the packages installed with pip --user.

However, when I launch a Python3 (ipykernel) Notebook, I cannot import timm or fastbook. It looks like it uses /opt/conda/bin/python 3.7.
I’m not exactly sure why some things are in the root directory while others are in the opt directory, as well as how to untangle the two. Perhaps this has something to do with Paperspace’s Docker image or something along those lines, but I don’t know enough to tell.

I do not believe the bash path to be important for this, but just in case, here it is:

I just spent a few hours going over walkthrough videos 1-6 and tried to make sure that I’ve followed the suggestions correctly, but have not found a fix for this yet. It would be great to fix this since it’s a major stumbling block to being able to use Paperspace. Has anyone else encountered this problem and found a way to fix this or a way around it?
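One way to compare the two environments is to run the same small report in ipython and in a notebook cell and look at the differences. This is a diagnostic sketch, not a fix; interpreter_report is a hypothetical helper. The user site directory is specific to the Python minor version, so packages installed with pip --user under 3.10 are not searched by a 3.7 kernel.

```python
import site
import sys


def interpreter_report():
    """Collect the facts that decide which pip --user packages are importable."""
    user_site = site.getusersitepackages()
    return {
        "executable": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        "user_site": user_site,
        "user_site_on_path": user_site in sys.path,
    }


for key, value in interpreter_report().items():
    print(f"{key}: {value}")
```

If the executable or version differs between the two, the notebook kernel is being started by a different Python than the one pip installed into.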

1 Like

You don’t want any copies of python or ipython in your homedir. Delete them.

1 Like

Beautiful, thank you so much, Jeremy! Everything works now! I think I remember you removing these while tackling ctags, but I must have tried doing that part a few times to get it to work and probably forgot to remove python on the last successful iteration.

2 Likes

This looks neat, but can I suggest that perhaps it’s time you slowly join in the docker-compose cult. :sweat_smile: Compose was built exactly for tweaking all these docker params & more.

Off the top of my head, when you run pip install --editable ., what happens is that a link to the folder you’re in is created in site-packages (usually a file called XXXXX.egg-link).

import site
site.getsitepackages()
# search for the egg link file in the folder and check the contents

import your_package  # replace with the importable name of your package
your_package.__file__
# this points to where your package module is actually located

Now, site-packages is where your Python interpreter searches for packages when you’re running imports. So, editable installs are really a way of installing the packages via links (conceptually like symlinks) instead of actual hard/final copies. If you connect the dots, this is mainly done for development purposes, but it can sometimes also be abused in production etc. envs.
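The two checks above can be wrapped into a small script. A sketch under assumptions: locate and editable_markers are hypothetical helpers, and the glob patterns cover the .egg-link files mentioned here plus the `__editable__` files that newer setuptools versions are assumed to write instead.

```python
import importlib.util
import site
from pathlib import Path


def locate(package_name):
    """Return the file a package would be imported from, or None if not found."""
    spec = importlib.util.find_spec(package_name)
    return spec.origin if spec else None


def editable_markers():
    """List files in site-packages that usually indicate an editable install."""
    dirs = list(site.getsitepackages()) + [site.getusersitepackages()]
    found = []
    for directory in map(Path, dirs):
        if directory.is_dir():
            found += sorted(directory.glob("*.egg-link"))
            found += sorted(directory.glob("__editable__*"))
    return found


print(locate("json"))  # stdlib example; use your own package name here
for marker in editable_markers():
    print(marker)
```

Unlike importing the package and reading `__file__`, find_spec reports the location without executing the package's code.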

For more internal details on how/what exactly happens, this YT video is quite good too.
https://www.youtube.com/watch?v=gYYi7varbmE

Now, on to this question: this trips up a lot of Docker users for a good reason, it’s a bit messy. It usually boils down to volume mounts. So, let’s say you’ve done an editable install during the docker build phase. At this point the editable install (hence the link) points to the correct folder. But when running the container, most users volume mount the package folder to some other folder in the container. Now, if the location the link points to (set during the build) and the path where the folder is now mounted don’t match, then Python is obviously going to pick up whatever was installed. Your changes are not landing in the location that the install points to.

Two ways to go about “fixing” this.

  • Match the exact folder paths of where you did the editable install from (you can see that in the egg file to debug) and where you eventually mount the package path to
    OR
  • Run the editable install only after you’ve done your volume mounts, the container is up and running etc. (eg. from a notebook cell)

The second one is probably a lot smoother (and more explicit) for notebook based work. Depends on what you prefer really. But yeah, the trick when debugging is to check whether the editable egg-link points to the exact location where the code edits/changes are happening.
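For the second option, a notebook cell along these lines could run the editable install once the container is up. This is a sketch under assumptions: editable_install_cmd is a hypothetical helper and /workspace is taken from the mount target used earlier in the thread. Using sys.executable makes pip install into the same interpreter the kernel is running.

```python
import subprocess
import sys


def editable_install_cmd(repo_path):
    """Build a pip command that targets the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", "--editable", repo_path]


cmd = editable_install_cmd("/workspace")
print(" ".join(cmd))
# Uncomment inside the running container, after the volume is mounted:
# subprocess.run(cmd, check=True)
```

If the package was already imported in the session, the kernel most likely needs a restart afterwards for the new location to take effect.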

Hopefully, that clarifies things a bit. Let me know otherwise. :raised_hands:

2 Likes

--mount is more explicit (allows more configuration) compared to -v, otherwise they’re pretty much the same. Official docs here.
https://docs.docker.com/storage/volumes/#choose-the--v-or---mount-flag

EDIT: Forgot to mention, there’s another major difference between bind mounts (mapping a host dir to a container dir), actual docker volumes (which persist data in a folder managed by docker itself, outside of your working directory) and a temp filesystem. That’s where things can become a bit confusing.
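For a plain bind mount, the two spellings look like this side by side. A small sketch; bind_mount_flags is a hypothetical helper that only builds the argument strings.

```python
def bind_mount_flags(host_dir, container_dir, style="mount"):
    """Return the docker run flags for a bind mount in either syntax."""
    if style == "mount":
        # Explicit key=value form.
        return ["--mount", f"type=bind,source={host_dir},target={container_dir}"]
    if style == "v":
        # Short colon-separated form.
        return ["-v", f"{host_dir}:{container_dir}"]
    raise ValueError(f"unknown style: {style!r}")


print(" ".join(bind_mount_flags("/home/me/notebooks", "/notebooks", "mount")))
print(" ".join(bind_mount_flags("/home/me/notebooks", "/notebooks", "v")))
```

One behavioural difference worth checking against the docs for your Docker version: with -v a missing host directory is created for you, while --mount reports an error instead.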

1 Like

Ah buddy, I have a fix for this (that I use for development work), but it’s fairly involved: a custom entrypoint that picks up the host uid:gid, passes it to the container startup entrypoint and switches user before starting the main process. This (transparently using the same uid:gid as the host) is not super straightforward in Docker land.

There are a couple of ways to go about it, but if chmod-ing on the host every now and then is not a pain, then I wouldn’t bother with so much extra stuff.
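One of the simpler ways, which is not the custom entrypoint approach described above, is docker run's --user flag with the host's numeric ids. A sketch under assumptions: host_user_flags is a hypothetical helper, and it relies on os.getuid, which is only available on Unix-like hosts.

```python
import os


def host_user_flags():
    """Return docker flags that run the container as the calling host user."""
    return ["--user", f"{os.getuid()}:{os.getgid()}"]


print(" ".join(host_user_flags()))
```

Files created in a bind mount then belong to the host user. The catch is that this uid usually has no name or home directory inside the image, so tools that expect to write to a home directory may fail, which is part of why the entrypoint approach exists.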

1 Like

Love all the attention vim is getting in the walkthroughs. :muscle: Good to know that I’m not the only one who hasn’t fully moved over to vscode yet.

1 Like