Walkthru 12: a detailed note in the form of questions
00:40 How does Jeremy evaluate models on the PETS dataset?
What’s interesting about the top 15 models? They have bits of everything (all different sorts of approaches).
01:53 What’s interesting about ViT models on the PETS dataset?
Large ViT models trained on large images can probably reach a lower error rate.
The best-performing ViT models are the early ones, suggesting ViT has not improved much since it came out.
03:38 How does Jeremy evaluate models on the Planet dataset?
04:54 What does Jeremy mean by a large model, or a model that uses large images?
06:03 How to use small and large pre-trained models when you do Kaggle or production projects?
09:16 Run Jeremy’s paddy notebooks and compare them with your own to find errors or differences
#question Has Jeremy shared his paddy notebooks in his paddy repo yet?
10:45 What is the difference between pth and pkl when saving a model?
The saved model is in pickle (pkl) format; pth is the variant extension that PyTorch uses for it.
Just make sure to save models with the pth extension.
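
To make the point about extensions concrete, here is a minimal sketch using only Python's standard-library pickle module. The weights dict is a made-up stand-in for a model's state and the file names are hypothetical. PyTorch's own torch.save builds on pickle but uses its own file layout, which this sketch does not reproduce.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a model's saved state (not real weights).
weights = {"layer.weight": [[0.1, 0.2], [0.3, 0.4]], "layer.bias": [0.0, 0.0]}

def save_and_load(obj, path):
    """Pickle obj to path, then read it back."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    with open(path, "rb") as f:
        return pickle.load(f)

with tempfile.TemporaryDirectory() as tmp:
    from_pkl = save_and_load(weights, Path(tmp) / "model.pkl")
    from_pth = save_and_load(weights, Path(tmp) / "model.pth")

# The extension is only a naming convention: both files hold the same pickled data.
same_content = from_pkl == from_pth == weights
print(same_content)
```

The extension changes nothing about what is stored; it only signals to readers and tools what kind of file to expect.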
11:56
12:33 If the disease-classification model can also learn to classify paddy varieties, then it may predict diseases even better. Why is that?
15:16 Why would building the above model together be a good way to test how well we understand the mechanisms of deep learning?
16:14 How does Jeremy test for the natural variation of the model’s performance?
How does Jeremy know whether the model’s performance is consistent?
What does it mean when the model’s performance (error-rate) jumps a lot?
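
One way to think about natural variation is to repeat the identical training run several times and look at the spread of the final error rates. The sketch below assumes that approach. The error rates are invented for illustration, and is_meaningful_change is a hypothetical helper, not something taken from the walkthrough.

```python
from statistics import mean, stdev

# Invented error rates from re-running the SAME small model five times.
baseline_runs = [0.052, 0.049, 0.055, 0.050, 0.053]

def is_meaningful_change(runs, new_error, n_std=2.0):
    """Return True if new_error lies outside the run-to-run noise band."""
    centre, spread = mean(runs), stdev(runs)
    return abs(new_error - centre) > n_std * spread

# A result inside the noise band tells us little; one far outside is worth a look.
small_shift = is_meaningful_change(baseline_runs, 0.051)
large_shift = is_meaningful_change(baseline_runs, 0.040)
print(small_shift, large_shift)
```

The threshold of two standard deviations is an arbitrary choice for the sketch. The underlying idea is only that a change smaller than the run-to-run spread cannot be told apart from noise.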
18:34 Why is it intuitive/counter-intuitive that everything that works in experiments on small models will be magnified in large models?
In other words, why should we exhaust our experiments on small models first and only then move on to large models?
20:04 What does a model consist of?
What does the body of a model look like and what does it do?
What does the head of a model look like and what does it do?
How to open up the inner details of a model?
21:46 How to show the shapes of input and output flow and callbacks at each layer for the entire model?
23:48 How to extract the head of the model and then the last layer of the head?
24:07 How to see the parameters of the last layer?
How to show the content of the parameters when ll.parameters() is too lazy to show the content?
What is the shape of the last layer?
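
The questions above can be tried out on a toy model. This is a minimal sketch, assuming a model laid out as a sequence of body and head. Every layer size here is an assumption except the final 512-to-10 linear layer, which matches the 10x512 last layer mentioned in these notes. A real fastai head contains more layers than this.

```python
import torch
from torch import nn

# Toy stand-in for a vision model: m[0] is the body, m[1] is the head.
body = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU())
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 512),
    nn.ReLU(),
    nn.Linear(512, 10, bias=False),
)
m = nn.Sequential(body, head)

h = m[1]    # the head
ll = h[-1]  # the last layer of the head

# ll.parameters() is a lazy generator, so wrap it in list() to see the tensors.
params = list(ll.parameters())
last_shape = tuple(params[0].shape)
print(ll)
print(last_shape)
```

The weight matrix of a linear layer is stored as (out_features, in_features), which is why a layer mapping 512 features to 10 classes has shape 10x512.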
25:51 How to use one model to predict both 10 diseases and 10 varieties of paddy?
Will we replace the original last layer (10x512) of the head with a different last layer (20x512)? NO
Will we build two linear layers instead of one onto the head?
27:51 How to remove the last layer from the head?
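
A minimal sketch of removing a layer, assuming the head is an nn.Sequential. The toy head below is an assumption; only its final 512-to-10 linear layer mirrors the notes.

```python
from torch import nn

# Toy head; the earlier layers are placeholders chosen for the sketch.
head = nn.Sequential(nn.Flatten(), nn.Linear(8, 512), nn.ReLU(), nn.Linear(512, 10))

n_before = len(head)
del head[-1]          # nn.Sequential supports item deletion
n_after = len(head)

# The head now ends at the layer that produces the 512 features.
print(n_before, n_after, head[-1])
```

Deletion happens in place, so any other variable that refers to the same head sees the change as well.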
28:38 How to create a DiseaseAndTypeClassifier class to build two linear layers for the head?
How to create the __init__ function of the DiseaseAndTypeClassifier class?
32:20 How to create the forward function of the DiseaseAndTypeClassifier class, and what does this forward do?
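
The class described above could look roughly like the following. This is a sketch under assumptions, not a transcript of the code from the video: the toy model m, the attribute names l1 and l2, and which layer is for diseases versus varieties are all illustrative. The 512 input features and 10 outputs per layer come from the shapes mentioned in these notes.

```python
import torch
from torch import nn

class DiseaseAndTypeClassifier(nn.Module):
    def __init__(self, m):
        super().__init__()                        # required for any nn.Module subclass
        self.l1 = nn.Linear(512, 10, bias=False)  # disease scores (assumed order)
        self.l2 = nn.Linear(512, 10, bias=False)  # variety scores (assumed order)
        del m[1][-1]                              # drop the original last layer of the head
        self.m = m

    def forward(self, x):
        x = self.m(x)                             # 512 features from body + trimmed head
        return self.l1(x), self.l2(x)             # two sets of scores from shared features

# Toy stand-in for the pretrained model (body, head); sizes are assumptions.
body = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU())
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 512), nn.ReLU(),
    nn.Linear(512, 10, bias=False),
)
m = nn.Sequential(body, head)

dtc = DiseaseAndTypeClassifier(m)
disease_out, variety_out = dtc(torch.randn(4, 3, 32, 32))
out_shapes = (tuple(disease_out.shape), tuple(variety_out.shape))
print(out_shapes)
```

Note that del m[1][-1] modifies the model that was passed in, so the original m no longer has its last layer after the classifier is built.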
34:25 What amazed Radek about this line of code: dtc = DiseaseAndTypeClassifier(m)? We have now actually created a new model which trains two separate linear layers (one for predicting varieties, the other for diseases) at the same time.
Don’t forget the boilerplate needed when creating a PyTorch module: super().__init__()
36:44 How to duplicate/copy the existing learner to make a new learner?
How to add our new model onto the newly duplicated/copied learner?
37:11 How does Jeremy sort out a half-precision error when making a prediction with the new model?
When things mess up without a clue, try restarting the kernel first.
#question When copying a learner, maybe it is safer to use deepcopy?
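
The difference behind this question can be shown with the standard-library copy module. In this sketch ToyLearner is a hypothetical stand-in for a learner, and its model is just a list of names, so nothing here depends on fastai.

```python
from copy import copy, deepcopy

class ToyLearner:
    """Hypothetical stand-in for a learner; the 'model' is a list of layer names."""
    def __init__(self, model):
        self.model = model

learn = ToyLearner(["body", "head"])

shallow = copy(learn)      # new learner object, SAME model object
deep = deepcopy(learn)     # new learner object, independent model

shares_model = shallow.model is learn.model
deep_shares_model = deep.model is learn.model

# Mutating the model in place through the shallow copy also changes the original.
shallow.model.append("extra layer")
original_changed = learn.model == ["body", "head", "extra layer"]
deep_unchanged = deep.model == ["body", "head"]
print(shares_model, deep_shares_model, original_changed, deep_unchanged)
```

Assigning a brand-new model to the copy does not touch the original learner. Only in-place changes to the shared model, such as deleting one of its layers, show up in both.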
42:19 How to get to the point where a copied learner (without the new model added) can run get_preds without error?
#question Did Jeremy assume at this point that the copied learner can still run get_preds without error after our new model is added? I think he assumed so at this moment, since he didn’t run the code for it.
43:46 How to create a DummyClassifier instead of using our DiseaseAndTypeClassifier and add it to the copied learner to see whether the new learner can still run get_preds?
This is a great demo of how to solve a problem by taking one step at a time! Very thorough, and a very solid scientific work style!
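
A dummy wrapper for this kind of step-by-step debugging might look like the sketch below. It assumes the dummy simply passes its input through the wrapped model unchanged; the toy model m and the input sizes are made up for the example.

```python
import torch
from torch import nn

class DummyClassifier(nn.Module):
    """Wraps a model and passes the input straight through it."""
    def __init__(self, m):
        super().__init__()
        self.m = m

    def forward(self, x):
        return self.m(x)

# Toy stand-in for the existing model; sizes are assumptions.
m = nn.Sequential(nn.Flatten(), nn.Linear(12, 10))
dummy = DummyClassifier(m)

x = torch.randn(4, 12)
same_output = torch.equal(dummy(x), m(x))
print(same_output)
```

Because the wrapper changes nothing about the computation, any error that appears after swapping it in must come from the wrapping itself rather than from new layers.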
44:44 How to create DiseaseAndTypeClassifier so that the new model is built nicely (tidy, clean code) with two separate layers for disease and variety prediction, then add this model to the copied learner and check whether get_preds works?
Now the error tells us we need a loss function, assuming there are no other problems with running get_preds.
47:10 Why should we expect an error about loss or loss func?
Reminder: what is a loss function or loss?
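
As a reminder of what a loss function computes, here is a minimal pure-Python sketch of cross-entropy for a single example. The logits are invented, and cross_entropy is a hand-written helper for illustration, not the fastai or PyTorch function.

```python
import math

def cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target class."""
    biggest = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(z - biggest) for z in logits]
    prob_target = exps[target] / sum(exps)
    return -math.log(prob_target)

logits = [2.0, 0.5, -1.0]   # invented scores for three classes

loss_if_right = cross_entropy(logits, target=0)   # the model favours class 0
loss_if_wrong = cross_entropy(logits, target=2)   # the model gave class 2 a low score
print(loss_if_right < loss_if_wrong)
```

The loss is a single number that is small when the model puts high probability on the correct class and large when it does not. A model that returns two sets of scores needs a loss that knows how to handle both.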