It is 1am UK time but I’ll try my best. If the session is recorded that’d be helpful in case I can’t make it. Thanks.
10am CET should be 9am UK time.
My bad, thanks for clarifying. Then I can make it. See you at the meetup.
Walkthru 11: detailed notes in questions
00:00 - Recap on Paddy Competition
04:30 - Tips on getting votes for Kaggle notebooks
07:30 - How to disconnect the other sessions in tmux?
When the same tmux session is attached from 3 different machines, press the tmux prefix (Ctrl+b by default) then Shift+D to list the attached clients and pick the one to detach.
09:10 - Welcoming a newcomer
10:30 - Weights and Biases Sweep
What does WandB do?
What is sweep?
How to create a sweep?
What does fine_tune.py tell WandB to run, and what info to extract?
What does the fastai-WandB integration do for you?
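To sketch how a sweep is created, here is a hedged example using WandB's Python API; the metric, parameter names, and values are assumptions for illustration, not the walkthrough's actual settings:

```python
import wandb

# Hypothetical sweep definition: search strategy, the metric to optimize,
# and the hyperparameters to vary (names/values are assumptions).
sweep_config = {
    "method": "random",
    "metric": {"name": "error_rate", "goal": "minimize"},
    "parameters": {
        "lr": {"values": [1e-3, 2e-3, 4e-3]},
        "resize_method": {"values": ["squish", "crop"]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="paddy")  # project name is a placeholder

def train():
    run = wandb.init()   # each sweep run receives its chosen values in run.config
    lr = run.config.lr
    # ... build the DataLoaders / Learner and fine-tune with this lr ...
    run.log({"error_rate": 0.05})  # placeholder: report the metric the sweep optimizes
    run.finish()

wandb.agent(sweep_id, function=train, count=5)  # launch 5 runs from this sweep
```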
14:40 - WandB can track GPU metrics
16:40 - What can fastgpu do for you? What is Jeremy’s plan for fastgpu in the future?
#question Should we use fastgpu with Paperspace for automating multiple notebooks?
What’s Jeremy’s opinion on WandB?
18:05 - What does the sweep.yaml file look like for doing a hyperparameter optimization search?
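From the CLI side, a sweep is usually defined in a YAML file; a hedged sketch of what such a sweep.yaml might contain (the values are assumptions, fine_tune.py is the script named above):

```yaml
# Hypothetical sweep.yaml; create the sweep with `wandb sweep sweep.yaml`,
# then start workers with `wandb agent <entity>/<project>/<sweep_id>`.
program: fine_tune.py        # the script each sweep run executes
method: random               # grid / random / bayes
metric:
  name: error_rate
  goal: minimize
parameters:
  lr:
    values: [1e-3, 2e-3, 4e-3]
  resize_method:
    values: [squish, crop]
```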
20:00 - How to access all your git repo’s information?
your-git-repo# cat .git/config
24:49 - How to extract the info we need from the WandB experiment results for further analysis?
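One way to pull the results into a notebook is WandB's public API; a minimal sketch, where "your-entity/your-project" is a placeholder path, not the walkthrough's actual project:

```python
import pandas as pd
import wandb

api = wandb.Api()
runs = api.runs("your-entity/your-project")  # placeholder entity/project path

# Gather each run's hyperparameters (config) and final metrics (summary)
# into one DataFrame for further analysis.
rows = []
for run in runs:
    config = {k: v for k, v in run.config.items() if not k.startswith("_")}
    summary = {k: v for k, v in run.summary._json_dict.items() if not k.startswith("_")}
    rows.append({"name": run.name, **config, **summary})

df = pd.DataFrame(rows)
df.head()
```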
25:05 - Why does Jeremy have to rerun the sweep experiment? For Resize, cropping is in fact usually better than squish, not the other way round.
26:00 - Why is using the WandB API from a Jupyter notebook so much better for Jeremy?
Is the parallel coordinates chart on WandB actually worth our attention when examining the experiment results? No, unfortunately.
31:30 - Why is Jeremy’s approach to hyperparameter optimisation more practical and beneficial than brute force?
Who taught WandB hyperparameter optimization?
Did Jeremy use hyperparameter optimization only once, and just for finding the best value of dropout?
32:33 - What’s Jeremy’s human-driven approach to hyperparameters?
Why don’t you have to do a grid search for hyperparameters?
What does Jeremy do to make the human-driven approach efficient and effective?
How does Jeremy accumulate knowledge of deep learning through these experiments?
What’s the terrible downside of doing brute-force hyperparameter optimization?
34:51 - Are many of the hyperparameter values Jeremy found through experiments applicable to different architectures/models/datasets?
Are there exceptions? Yes: tabular datasets.
It’s crazy that no one has done serious experiments to figure out the best hyperparameters for vision problems like segmentation, bounding boxes, etc.
37:30 - Why does Jeremy not use learn.lr_find any more?
39:39 - How to find out where a Jupyter notebook is running behind the scenes?
ps waux | grep jupyter
42:00 - How to get a program running in the background in the terminal?
Ctrl+Z suspends the program currently running in the foreground.
bg 1 (or 2, ...: the job number) resumes the suspended program in the background.
jupyter notebook --no-browser & starts it in the background directly.
fg brings it back to the foreground, and Ctrl+C then kills it.
How to search
46:20 - How to iterate and improve by duplicating notebooks with different methods or modified models
Jupyter: output toggle feature
49:51 - Why does Jeremy focus on how the final error-rate differs from the tta result for each model?
tta is what Jeremy uses in the end; the final error-rate from training is for reference, I think.
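For context, a minimal sketch of getting both numbers with fastai, assuming `learn` is an already-trained vision Learner with error_rate as a metric:

```python
from fastai.vision.all import *

# Assumes `learn` is an already-trained vision Learner with error_rate as a metric.
plain = learn.validate()                   # [valid_loss, error_rate] on the validation set
preds, targs = learn.tta()                 # predictions averaged over augmented copies
tta_err = error_rate(preds, targs).item()
print(f"plain error: {plain[1]:.4f}  tta error: {tta_err:.4f}")
```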
50:50 - How to build models on the vit_small_patch16_224 pretrained model
#question Why did Jeremy choose to build 3 models for each pre-trained model? One each for squish, crop, and padding.
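As a hedged illustration of building one model per resize method, here is a minimal fastai sketch; the dataset path and labelling rule are placeholders, not the walkthrough's paddy code:

```python
from fastai.vision.all import *

path = untar_data(URLs.PETS)/'images'   # placeholder dataset, not the paddy data

def learner_for(method):
    # method is one of 'squish', 'crop', 'pad': how Resize fits images to 224px,
    # matching the input size vit_small_patch16_224 expects
    dls = ImageDataLoaders.from_name_func(
        path, get_image_files(path),
        label_func=lambda f: f.name[0].isupper(),  # placeholder labelling rule
        item_tfms=Resize(224, method=method),
        valid_pct=0.2, seed=42)
    return vision_learner(dls, 'vit_small_patch16_224', metrics=error_rate)

# one Learner per resize method, ready to fine-tune and compare
learners = {m: learner_for(m) for m in ('squish', 'crop', 'pad')}
# e.g. learners['squish'].fine_tune(3)
```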