Live coding 11

jeremy · June 15, 2022, 2:01am

This topic is for discussion of the 11th live coding session

<<< session 10 ｜ session 12 >>>

Recording

Links from the walk-thru

What was covered

On Becoming a Good Deep Learning Practitioner
Weights and Biases for managing large experiments
The human optimiser vs brute force approach
- Optimizing hyperparams for image datasets in fastai
Managing experiments with notebooks

(please contribute here)

Video timeline by @daniel and @mattr

00:00 - Recap on Paddy Competition
04:30 - Tips on getting votes for Kaggle notebooks
07:30 - Gist uploading question
10:30 - Weights and Biases Sweep
14:40 - Tracking GPU metrics
16:40 - fastgpu
20:00 - Using gitconfig
21:00 - Analysis notebook
26:00 - Parallel coordinates chart on wandb
31:30 - Brute force hyperparameter optimisation vs human approach
37:30 - Learning rate finder
40:00 - Debugging port issues with ps
42:00 - Background sessions in tmux
46:20 - Strategy for iterating between notebooks
49:00 - Cell All Output toggle for overview
50:50 - Final transform for vit models
52:05 - swinv2 fixed resolution models
53:00 - Building an ensemble - appending predictions
55:50 - Model stacking
57:00 - Keeping track of submission notebooks

miwojc · June 15, 2022, 4:35am

Would it be possible to use something like linear regression to figure out the weights for a model ensemble?
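As a rough illustration of the idea in the question (not something shown in the session), ensemble weights can be fitted by ordinary least squares on held-out predictions. The sketch below assumes each model's validation predictions have been flattened into a plain list of numbers and that `targets` holds the matching true values (for example one-hot labels). The function name `fit_ensemble_weights` and the toy numbers are hypothetical.

```python
def fit_ensemble_weights(model_preds, targets):
    """Least-squares weights (no intercept) for blending model predictions.

    model_preds: list of M lists, each holding N predictions from one model.
    targets:     list of N true values.
    """
    m = len(model_preds)
    # Normal equations: (P^T P) w = P^T y
    A = [[sum(a * b for a, b in zip(model_preds[i], model_preds[j]))
          for j in range(m)] for i in range(m)]
    b = [sum(p * t for p, t in zip(model_preds[i], targets)) for i in range(m)]

    # Gaussian elimination with partial pivoting
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictions are linearly dependent; weights are not unique")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]

    # Back substitution
    w = [0.0] * m
    for i in reversed(range(m)):
        s = sum(A[i][j] * w[j] for j in range(i + 1, m))
        w[i] = (b[i] - s) / A[i][i]
    return w


# Toy validation data (made up): targets are an exact 0.7/0.3 blend of two models.
preds_a = [0.9, 0.1, 0.8, 0.3]
preds_b = [0.6, 0.4, 0.2, 0.7]
targets = [0.7 * a + 0.3 * b for a, b in zip(preds_a, preds_b)]

weights = fit_ensemble_weights([preds_a, preds_b], targets)
print(weights)
```

The weights should be fitted on a validation set the models were not trained on, otherwise the blend tends to favour whichever model overfits most. Plain least squares also allows negative weights; constraining them is a separate design choice.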

miwojc · June 15, 2022, 4:54am

Kaggle is another system for ‘experiment tracking’ in a way. When you save and run all (commit), it saves the notebook with its outputs, input data, and output data under a version, so you can come back and recheck.

BTW, I just noticed that when you are nearly out of GPU quota on Kaggle, with only minutes left in your weekly quota, you can still start a save and run all and get additional time (up to 9 hours, the kernel run time).

Moody · June 15, 2022, 5:43am

Congratulations to @radek! First gold medal as a notebooks contributor.

Thank you for all your contribution to the fast.ai community. I learn a lot from you.

mike.moloch · June 15, 2022, 10:31am

Regarding finding out about processes that are holding a port (say, 8889), I use this:

On Linux:

netstat -ano | grep 8889

On macOS, there is no -o option, but -v can be used. We’re trying to get it to list the port numbers.

netstat -van | grep 8889

If you don’t know the port, you can also grep for a substring of the process name to keep just the lines containing that substring (e.g., grep python or grep -i jupyter; note that the -i option makes grep case-insensitive).

P.S. On a Paperspace instance you’ll need to install netstat first. On some systems you may also need sudo, depending on how locked down the network binaries are.
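If netstat is not installed, a quick check from Python can at least confirm whether anything is listening on a port. This is a minimal sketch using only the standard library; the helper name `port_in_use` is hypothetical, and unlike the shell tools it does not tell you which process holds the port.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8889 is the example port from the post above
    print(port_in_use(8889))
```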

radek · June 15, 2022, 11:07am

Oh wow, Sarada, this is wonderful! My first ever Kaggle Notebook gold! Earned with the help of our amazing community! It couldn’t get better than this!

Thank you so very much for making this happen, and my heartfelt thank you to everyone who contributed their votes.

And thank you so much for your very kind words! I continue to learn a lot from everyone here. It was wonderful to see you in the walk-through today, Sarada!!!

BTW what surprises me quite a bit is the number of medals in discussions. I can barely remember talking online that much. But oh well, if there is a record to prove it, I guess it must have happened.

Thank you so very much again everyone for this wonderful surprise!

Mattr · June 15, 2022, 7:07pm

Something peculiar I noticed on Paperspace today is that I was getting a very long error when creating the data loader with the same exact code that worked on a different machine yesterday.

I tried restarting the kernel but got the same error. Then I stopped the machine (P5000), started a new RTX4000 machine, and ran the exact same notebook: the same code ran without error. No idea why, but it seems there is a problem somewhere.

gsg · June 15, 2022, 8:07pm

I had the same error… reducing the size in aug_transforms (on a hunch) made the error go away… (both on P5000)

Mattr · June 15, 2022, 10:19pm

There are 19 people on the same score as me in the Paddy competition now. Is the order arbitrary or in order of submission when scores are the same?

Moody · June 16, 2022, 4:14am

The order looks random to me. You are doing a good job.

miwojc · June 16, 2022, 4:15am

I always thought it’s ordered by submission time if the score is the same…

miwojc · June 16, 2022, 4:16am

The time here could be that of the last submission, which is not necessarily the best one, I think…
Edit: I meant that when you submit a solution, Kaggle records its time. If it’s your best score, the score and hopefully the position get updated on the leaderboard. However, if the score is lower than your best, only the submission time is recorded.

nikem · June 16, 2022, 6:59am

Ok, I had the same on my local computer. Could you check my full trace and compare it to yours?
Mine disappears when I run the same line again. It is weird.

Mattr · June 16, 2022, 9:21am

Is there a generally accepted description for the type of multiple-target model you started to build today, Jeremy? It’s not multi-modal because we are only using images for training. Would this be correctly referred to as a multi-head model?

jeremy · June 16, 2022, 9:54am

I’m not sure I’ve really heard them being given a name before! That description you linked to seems close enough, however.

Although as you’ll see tomorrow I’ve come up with a much simpler approach…

bilalUWE · June 16, 2022, 8:39pm

How can we get the sweep ids from the wandb experimentation? I’m running sweep.py to generate the data for analysis.py as Jeremy did, but I don’t know how to construct the sweep id when creating the dictionaries for the dataframe. Any hint? Thanks.
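One possible way to approach this, sketched under assumptions rather than taken from Jeremy's analysis notebook: as far as I know, the sweep id is the short string wandb prints when the sweep is created (it also appears in the sweep's URL), and the public API addresses a sweep as `entity/project/sweep_id`. The helper names below are hypothetical; `wandb.Api().sweep(...)`, `sweep.runs`, `run.config` and `run.summary._json_dict` follow the pattern in the wandb export documentation, but check them against your installed version.

```python
def sweep_path(entity, project, sweep_id):
    """Build the path string the wandb public API expects for a sweep."""
    return f"{entity}/{project}/{sweep_id}"


def runs_to_rows(runs):
    """Turn run objects into one flat dict per run (config + summary metrics)."""
    rows = []
    for run in runs:
        row = {"name": run.name}
        # Skip wandb's internal config keys, which start with an underscore
        row.update({k: v for k, v in run.config.items() if not k.startswith("_")})
        row.update(run.summary._json_dict)
        rows.append(row)
    return rows


def fetch_sweep_rows(entity, project, sweep_id):
    """Fetch all runs of one sweep; needs wandb installed and a logged-in account."""
    import wandb  # imported lazily so the helpers above work without wandb

    api = wandb.Api()
    sweep = api.sweep(sweep_path(entity, project, sweep_id))
    return runs_to_rows(sweep.runs)
```

The resulting list of dicts can be passed straight to `pandas.DataFrame(rows)`; rows from several sweeps can be concatenated before building the dataframe.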

suvash · June 16, 2022, 10:20pm

I really like using the lsof command for this category of issues.

lsof -i :4321 # gets the process that "opened" the port number 4321

Pretty handy for hunting down the process occupying a local port, locked/opened files, etc. lsof (list open files) does a lot more and is included with most distros.

msivanes · June 20, 2022, 2:59pm

Adding the blogpost mentioned by Jeremy about the experiments on the topic 31:30 - Brute force hyperparameter optimisation vs human approach.

tcapelle · June 22, 2022, 4:09pm

Ohh, I can help you with that in the live session on Friday the 24th. I updated the README to make it clearer.

bilalUWE · June 22, 2022, 6:00pm

That would be great. Would you please share the link to the live session on the 24th? Thanks.