Live coding 11


It is 1am UK time but I’ll try my best. If the session is recorded that’d be helpful in case I can’t make it. Thanks.

10am CET should be 9am UK time.

My bad. Thanks for the clarification. Then I can make it. See you at the meetup.

Walkthru 11 detailed notes in question form

00:00 - Recap on Paddy Competition

04:30 - Tips on getting votes for Kaggle notebooks

07:30 - How to disconnect the other sessions in tmux?

When the same tmux session is attached from 3 different machines, press the tmux prefix (Ctrl-b) then Shift-d (capital D) and select the client to detach

09:10 - Welcoming a newcomer

10:30 - Weights and Biases Sweep

What does WandB do?
What is a sweep?
How to create a sweep?
What does fine_tune.py tell WandB about each run, and what info does it extract?
What does the fastai-WandB integration do for you? (see the sketch after this list)
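To make the questions above concrete, here is a minimal sketch of what a sweep-driven training script in the style of fine_tune.py could look like with the fastai-WandB integration. The project name, dataset path and the hyperparameter names (arch, resize_method, lr, epochs) are illustrative assumptions, not Jeremy's actual script:

import wandb
from fastai.vision.all import *
from fastai.callback.wandb import WandbCallback

wandb.init(project='paddy')        # registers this run with WandB (placeholder project name)
cfg = wandb.config                 # the sweep agent fills this with the sampled hyperparameters

dls = ImageDataLoaders.from_folder(
    Path('train_images'), valid_pct=0.2, seed=42,
    item_tfms=Resize(480, method=cfg.resize_method),
    batch_tfms=aug_transforms(size=224))
learn = vision_learner(dls, cfg.arch, metrics=error_rate,
                       cbs=WandbCallback())   # logs losses, metrics and config to the run
learn.fine_tune(cfg.epochs, cfg.lr)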

14:40 - WandB can track GPU metrics

16:40 - What can fastgpu do for you? What is Jeremy's plan for fastgpu in the future?

#question Should we use fastgpu with paperspace for automating multiple notebooks?

What’s Jeremy’s opinion on WandB?

18:05 - What does the sweep.yaml file look like for a hyperparameter optimization search?
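The same kind of sweep can also be registered from Python instead of a YAML file; a hedged sketch with illustrative parameter names and values matching the fine_tune.py sketch above:

import wandb

sweep_config = {
    'program': 'fine_tune.py',                 # script the sweep agent runs for each trial
    'method': 'grid',                          # or 'random' / 'bayes'
    'metric': {'name': 'error_rate', 'goal': 'minimize'},
    'parameters': {
        'arch': {'values': ['resnet26d', 'convnext_tiny']},
        'resize_method': {'values': ['squish', 'crop', 'pad']},
        'lr': {'values': [0.005, 0.008, 0.01]},
        'epochs': {'value': 5},
    },
}

sweep_id = wandb.sweep(sweep_config, project='paddy')   # placeholder project name
# each trial is then launched with: wandb agent <entity>/paddy/<sweep_id>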

20:00 - How to access all your git repo’s information?

your-git-repo# cat .git/config

24:49 - How to extract the info we need from the WandB experiment results for further analysis

Model Analysis Repo

25:05 - Why does Jeremy have to rerun the sweep experiment?

cropping is in fact usually better than squish for Resize, not the other way round.

26:00 - Why is using the WandB API with a Jupyter notebook so much better for Jeremy?

Is the parallel coordinates chart on WandB actually worth our attention for examining the experiment results? No, unfortunately.
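A minimal sketch of pulling the sweep results into a notebook with the public WandB API and turning them into a pandas DataFrame; the entity/project path and the error_rate metric name are placeholders:

import wandb
import pandas as pd

api = wandb.Api()
runs = api.runs('my-entity/paddy')     # placeholder entity/project

rows = []
for run in runs:
    rows.append({'name': run.name,
                 **{k: v for k, v in run.config.items() if not k.startswith('_')},  # hyperparameters
                 **run.summary._json_dict})                                         # final logged metrics
df = pd.DataFrame(rows)

df.sort_values('error_rate').head()    # best runs first, assuming an 'error_rate' metric was logged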

31:30 - Why Jeremy's approach to hyperparameter optimisation is more practical and beneficial than brute force

Who taught WandB hyperparameter optimization?

Did Jeremy use the hyperparameter optimization method only once, and just for finding the best value of dropout?

32:33 - What's Jeremy's human-driven approach to hyperparameters?

Why don't you have to do a grid search for hyperparameters?

What does Jeremy do to make the human-driven approach efficient and effective?

How does Jeremy accumulate knowledge of deep learning through these experiments?

What’s the terrible drawback or downside of doing brute force hyperparameter optimizations?

34:51 - Are many of the hyperparameter values Jeremy found through experiments applicable to different architectures/models/datasets?

Are there some exceptions? Yes, tabular datasets.

It's crazy that no one has done serious experiments to figure out the best hyperparameters for vision problems like segmentation, bounding boxes, etc.

37:30 - Why does Jeremy not use learn.lr_find any more?

39:39 - How to find out where a Jupyter notebook is running behind the scenes?

ps waux | grep jupyter

42:00 - How to get a program running in the background in the terminal?

ctrl + z to suspend a program running in the terminal
bg 1 (or 2, ...) to resume the suspended job in the background
jupyter notebook --no-browser & to start it in the background from the outset
fg to bring it back to the foreground, then ctrl + c to kill it

How to search

46:20 - How to iterate and improve by duplicating notebooks with different methods or modified models

Jupyter: output toggle feature

49:51 - Why does Jeremy focus on the final error-rate, and how does it differ from the tta result for each model?

tta is what Jeremy uses in the end; the final error-rate of training is for reference, I think.
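For context, a sketch of how the tta result is produced on a trained fastai Learner (learn is assumed to be already fine-tuned):

from fastai.metrics import error_rate

probs, targs = learn.tta()        # averages predictions over several augmented views of the validation set
print(error_rate(probs, targs))   # often a little better than the last epoch's error-rate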

50:50 - How to build models on the vit_small_patch16_224 pretrained model

#question Why did Jeremy choose to build 3 models for each pre-trained model: squish, crop and pad? (see the sketch below)
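A sketch of how the three variants per architecture might be trained, one per Resize method; the path, image sizes, epoch count and learning rate are placeholders, not Jeremy's exact settings:

from fastai.vision.all import *

def train(arch, method):
    # each item is resized with the given method, then augmented/cropped to 224 on the GPU
    dls = ImageDataLoaders.from_folder(
        Path('train_images'), valid_pct=0.2, seed=42,
        item_tfms=Resize(480, method=method),
        batch_tfms=aug_transforms(size=224, min_scale=0.75))
    learn = vision_learner(dls, arch, metrics=error_rate).to_fp16()
    learn.fine_tune(5, 0.01)
    return learn

for method in (ResizeMethod.Squish, ResizeMethod.Crop, ResizeMethod.Pad):
    train('vit_small_patch16_224', method)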

52:05 - How to evaluate models built on swinv2_base_window12_192_22k

53:36 - How to build models on large pre-trained models (paddy large notebook) from paddy small notebook

Why remove seed=42, and why is that fine?

54:52 - What models did Jeremy use for the final submission up to now?

55:50 - How much better is model stacking/ensembling than individual outstanding models? (see the sketch below)
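A sketch of a simple ensemble: average the tta probabilities of several trained models and take the argmax; learners and tst_dl are assumed to exist already (e.g. the Learners trained above and a test dataloader built with dls.test_dl):

import torch

all_probs = []
for learn in learners:                       # e.g. the vit, swin and convnext models
    probs, _ = learn.tta(dl=tst_dl)          # tta predictions on the same test dataloader
    all_probs.append(probs)

avg_probs = torch.stack(all_probs).mean(0)   # element-wise mean of the class probabilities
preds = avg_probs.argmax(dim=1)              # predicted class index per test image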

57:00 - Keeping track of submission notebooks

How to become a good deep learning practitioner
