Neural Style Transfer: for transferring the style of paintings/art onto a new image.
Inception model is memory intensive; ResNet works really well for most cases.
Fine-tuning -> makes a classifier better in case of fine-grained categories.
Why we need to train the last layers -- more abstract, high-level combinations of features are captured by the later layers, and they might need some additional parameter tuning to improve classification of fine-grained classes.
To fine-tune, we should not use a large learning rate (alpha); instead, use the LR finder to search for an optimal range of values. In general, large learning rates might not lead to convergence to a local/global optimum.
Rule of thumb: pass slice(x, y) as max_lr, where
x -> value from the LR finder plot from before things started getting worse (ideally 10x before that point)
y -> value 10 times smaller than the learning rate used in the 1st stage
Additionally, learning rates should be chosen by taking the corresponding losses on the LR plot into account.
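As a rough illustration of what slice(x, y) does: fastai gives the earliest layer groups the smaller learning rate x and the head the larger rate y, with intermediate groups in between. This is a minimal sketch assuming a simple geometric spacing; the library's exact interpolation rule may differ.

```python
# Hedged sketch: expand max_lr=slice(x, y) into one learning rate per layer
# group, spaced geometrically from x (earliest layers) to y (the head).
def discriminative_lrs(x, y, n_groups=3):
    ratio = (y / x) ** (1 / (n_groups - 1))
    return [x * ratio ** i for i in range(n_groups)]

# e.g. slice(1e-5, 1e-3) over the 3 layer groups fastai uses for a resnet
lrs = discriminative_lrs(1e-5, 1e-3, 3)
print(lrs)  # earliest layers get 1e-5, middle 1e-4, head 1e-3
```

The point is that the pretrained early layers (edges, textures) barely need adjusting, while the task-specific head can move faster.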
I am trying image classification using the resnet34 architecture. I trained the model for 5 initial cycles and then unfroze and trained again. My loss spiked dramatically after unfreezing. I know this behaviour has been discussed on the forum before, but my question is more granular. When we unfreeze the network and train, how could the weights get worse than before unfreezing?
My understanding is, weights should only improve. What exactly happens to network weights after we unfreeze?
So after this lesson, I worked really hard on gathering a data set of images of Goku and Vegeta from the show Dragon Ball Z. I quickly learned how difficult it is to get a good, working data set of images. My data set now has over 300 images (private) on my Kaggle account. I followed all of the instructions and it felt really good to go from start to finish (gathering to training). There were a lot of errors to overcome. My model is around 90% accurate at telling whether a picture is of Goku or Vegeta.
Even though I did these things, there’s still so many things I don’t understand. I don’t really get anything besides how to feed things to these functions. But even then, I don’t understand how many cycles I should be doing, I don’t understand what exactly the learner is and how it’s connected to the model, I don’t understand what to be looking for when fitting or tweaking a model, I don’t really get anything. I know these things will be cleared up in the future, and I’m excited to move onto the next lesson.
Hi there - I just finished the first lesson and it was awesome. However, using a p2.xlarge EC2 instance following the instructions in the course pages, fitting the model each time was very slow. For example, training the CNN the first time on the pets dataset took me ~5 minutes vs. 2 for Jeremy. The MNIST example took 10 minutes. I wonder if there's something I'm missing about configuring it to use the GPU?
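One quick way to rule out a CPU-only setup is to ask PyTorch whether it can see a CUDA device, since fastai trains on the GPU only when one is visible. A minimal check, guarded so it also runs where torch isn't installed:

```python
def gpu_available():
    """Return True/False if PyTorch is installed, None if it isn't."""
    try:
        import torch
    except ImportError:
        return None
    # True means a CUDA device is visible and fastai will train on it
    return torch.cuda.is_available()

print(gpu_available())
```

If this prints False on a p2.xlarge, the CUDA drivers or the GPU build of PyTorch are likely missing, which would explain the slow epochs.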
Hi Everyone,
I was trying to build an image classifier to classify rugby vs. American football images after watching the Lesson 1 2019 video. I am getting an error rate of 0.02, but I have the following doubts:
1. I have put around 100 images each of American football and rugby into a Google Drive folder and used the Path() function to specify the path of that folder as the path of my images. Is that the right way to specify a dataset?
2. The most confused images (as attached) show that the model predicts rugby less confidently, but actually it should have predicted rugby with a higher probability. Can someone please explain exactly what that means?
3. interp.most_confused(min_val=2) outputs []. Is that the case for all datasets which have only 2 classes?
4. In the cricket vs. baseball example (from the video), plot_with_title() was used, but I am currently unable to use that function. Is the function still available?
If anyone could clear these doubts, it would be really helpful.
Thank You!
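On question 1: the from_folder-style loaders in fastai v1 infer labels from subfolder names, so pointing Path() at the parent folder works if each class has its own subfolder. A stdlib-only sketch of the assumed layout (the folder names here are just illustrations):

```python
import tempfile
from pathlib import Path

# Build the layout that ImageDataBunch.from_folder-style loaders expect:
# one subfolder per class under the root, each holding that class's images.
root = Path(tempfile.mkdtemp())
for cls in ("rugby", "american_football"):
    (root / cls).mkdir()

# The class labels are inferred from the subfolder names
classes = sorted(p.name for p in root.iterdir() if p.is_dir())
print(classes)  # ['american_football', 'rugby']
```

If all the images are instead dumped flat into one folder, you would need to label them another way (e.g. from filenames), so the folder-per-class layout is the simplest option.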
I took a database of face expressions http://www.kasrl.org/jaffe.html (213 images of 7 facial expressions (6 basic facial expressions + 1 neutral) posed by 10 Japanese female models)
DI - Disgust
SA - Sadness
AN - Anger
HA - Happy
FE - Fear
SU - Surprise
NE - Neutral
I ran resnet50 for 50 cycles (I trust something in the fastai library won't let me overfit). The error rate went from 80% to 30% after 10 cycles, and finally to 12% after all 50 cycles. Wow.
Then I unfroze the learner and did 20 more cycles. The error went up from 11% to 50% by the 4th cycle, and then slowly came down to 4% over the last 3 cycles. I have the strong suspicion this is still overfitting. It is astonishingly good.
Hi, can you tell me what is meant when interp.plot_top_losses gives results like AN/AN/0.45/0.64?
From what I understood after watching Lesson 1, in this case the model predicted the expression of the face as Anger with a probability of 45%, while it should have predicted the expression as Anger with a probability of 64%. What is the significance of this slight difference, and why does such an error occur?
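For what it's worth, in fastai v1 the title format of plot_top_losses is prediction/actual/loss/probability of the actual class, so the third number is a loss value, not a probability. The numbers in the example above are in fact consistent with cross-entropy, which for the true class is -log(p):

```python
import math

# Cross-entropy loss for the true class is -log(p), where p is the model's
# probability for that class. For the AN/AN/0.45/0.64 example above:
p_actual = 0.64
loss = -math.log(p_actual)
print(round(loss, 2))  # 0.45, matching the third number in the title
```

So AN/AN/0.45/0.64 reads: predicted Anger, actual Anger, loss 0.45, and the model assigned the correct class a probability of 0.64.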
Overfitting would mean that your validation loss gets worse as you continue training. It's in the nature of fit_one_cycle that the loss gets worse at first, but as long as it's going down in the end you're fine.
One thing I noticed: when you unfreeze your model, you have to call lr_find after you call unfreeze(). In the notebook you call it before, which really doesn’t give you the information you need for choosing the learning rate.
@gluttony47 I have also just started trying to go through the tutorials and I am trying to use GCP as well. I have hit the same issue as you. I also have fastai in my conda list. I was wondering if you had heard anything, and I wanted the issue to be seen again as well.
Based on what is taught in Lesson 1, I trained a resnet50 model to classify pictures of romanesque cathedrals vs. gothic cathedrals. I achieved an error rate of 5.1%. My notebook along with the textfiles containing the urls of the images I used for training and validation can be found here: https://github.com/g-vk/fastai-course-v3
Great job @gvolovskiy… I have a doubt… following the lesson notebook, I understood that the learning rate must be found before the unfreeze step. Something like this:
learn.load('stage-1')
learn.lr_find()
learn.recorder.plot()
learn.unfreeze()
Pls let me know!
Before unfreezing, the set of trainable parameters is the same as when training the head of the model. Since after loading the stage-1 weights the weights of the head are already trained, there is no need to train them again, and hence no need to look for a good learning rate. It is only after we enlarge the set of trainable parameters by invoking learn.unfreeze() that looking for a suitable learning rate becomes necessary.
I hope my explanation was helpful for you.
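Putting the two posts together, a sketch of the stage-2 sequence with lr_find called after unfreeze, using the fastai v1 calls from the course notebooks. The epoch count and slice bounds are illustrative placeholders, not recommendations:

```python
# Hedged sketch, assuming fastai v1's Learner API as used in the course.
# Wrapped in a function so the call ordering is explicit.
def stage2(learn):
    learn.load('stage-1')    # restore the weights from training the head
    learn.unfreeze()         # make every layer group trainable
    learn.lr_find()          # search for a learning rate *after* unfreezing
    learn.recorder.plot()    # inspect the loss-vs-LR curve before choosing
    learn.fit_one_cycle(4, max_lr=slice(1e-6, 1e-4))  # placeholder values
```

Running lr_find before unfreeze would profile only the head's parameters, which is why the plot from that ordering doesn't tell you anything about training the full network.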