Lesson 1 In-Class Discussion ✅


I'm having this problem while running the code in Colab. Can anyone help me rectify the error?

It turns out the code throws an error when I try to run it on the GPU; when I switch the runtime type to None, it works fine.

Can someone tell me where the problem is?

Lesson 1 key takeaways:

  • Cats and Dogs (http://www.robots.ox.ac.uk/~vgg/publications/2012/parkhi12a/parkhi12a.pdf) has 12 cat breeds + 25 dog breeds = 37 categories

  • Fine-grained classification deals with categories that are visually very similar.

  • A standard image size is 224×224 pixels.

  • Centre cropping is really useful for resizing images to the desired dimensions; it applies a combination of cropping, resizing, padding, etc.

  • Images have pixel values in the range [0, 255]

  • Normalizing images is important (note: here it actually refers to standardization — see https://stats.stackexchange.com/questions/10289/whats-the-difference-between-normalization-and-standardization), after which each of the 3 RGB channels has a mean of 0 and a standard deviation of 1 via \frac{x-\mu}{\sigma} (see the code sketch after this list)

  • ResNet models pre-trained on ImageNet are used, in two architectural variants: 34 and 50 layers

  • Transfer learning is at the core of getting remarkably high accuracy from relatively few training examples.

  • Very few example images (from new classes) are needed to train a new classifier via transfer learning

  • One Cycle Learning (https://sgugger.github.io/the-1cycle-policy.html)

  • Error rate = 100 - classification accuracy

  • Neural Style Transfer : For transferring the style of paintings/art to a new image.

  • The Inception model is memory-intensive; ResNet works really well in most cases.

  • Fine-tuning -> makes a classifier better in the case of fine-grained categories.

  • Why we need to train the last layers: more abstract, high-level combinations of features are captured by the later layers, and these may need some additional parameter tuning to improve classification of fine-grained classes.

  • To fine-tune, we should not use a large learning rate (alpha); instead, use the LR finder to search for an optimal range of values. In general, too large a learning rate may prevent convergence to a local/global optimum.

  • Rule of thumb: pass slice(x, y) as max_lr (illustrated in the code sketch after this list), where

x -> a value from the LR finder plot taken before things started getting worse (ideally 10x before that point)
y -> a value 10 times smaller than the learning rate used in the 1st stage

  • Additionally, learning rates should be chosen by taking the corresponding losses on the LR plot into account.
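
To make the normalization and slice(x, y) takeaways concrete, here is a minimal sketch using the fastai v1 API from the lesson; the folder path and the exact learning-rate values are my own illustrative assumptions, not values from the lecture:

from fastai.vision import *
from fastai.metrics import error_rate

path = Path('data/my_images')  # hypothetical folder with one subfolder per class

# standardize each RGB channel to mean 0 and std 1 using ImageNet statistics
data = ImageDataBunch.from_folder(path, valid_pct=0.2, ds_tfms=get_transforms(),
                                  size=224).normalize(imagenet_stats)

# stage 1: train only the head of a pre-trained ResNet-34
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
learn.fit_one_cycle(4)

# stage 2: unfreeze, inspect the LR plot, fine-tune with slice(x, y)
learn.unfreeze()
learn.lr_find()
learn.recorder.plot()
# x = 1e-6: a point before the plotted loss starts getting worse (illustrative)
# y = 1e-4: roughly 10x below the stage-1 learning rate (illustrative)
learn.fit_one_cycle(2, max_lr=slice(1e-6, 1e-4))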

Hope it helps someone! :smiley:


I am trying image classification using the resnet34 architecture. I trained the model for 5 initial cycles, then unfroze and trained again. My loss spiked enormously after unfreezing. I know this behaviour has been discussed on the forum before, but my question is more granular: when we unfreeze the network and train, how can the weights get worse than they were before unfreezing?
My understanding is that the weights should only improve. What exactly happens to the network weights after we unfreeze?

So after this lesson, I worked really hard on gathering a data set of images of Goku and Vegeta from the show Dragon Ball Z. I quickly learned how difficult it is to put together a good, working data set of images. My data set now has over 300 images (kept private) on my Kaggle account. I followed all of the instructions, and it felt really good to go from start to finish (gathering to training). There were a lot of errors to overcome. My model is around 90% accurate at telling whether a picture is of Goku or Vegeta.

Even though I did these things, there are still so many things I don't understand. I don't really get anything besides how to feed things to these functions. Even then, I don't understand how many cycles I should be doing, what exactly the learner is and how it's connected to the model, or what to look for when fitting or tweaking a model. I know these things will be cleared up in the future, and I'm excited to move on to the next lesson.


Continuing the discussion from :memo: Deep Learning Lesson 1 Notes:

Hi there - I just finished the first lesson and it was awesome. However, using a p2.xlarge EC2 instance and following the instructions in the course pages, fitting the model was very slow each time. For example, training the CNN on the pets data for the first time took ~5 minutes for me vs. 2 for Jeremy. The MNIST example took 10 minutes. I wonder if there's something I'm missing about configuring it to process using the GPU?

Thanks in advance

Nevermind - I actually found the answer here: SageMaker is very slow

Hi,

Under the “Other data formats” heading in Lesson 1:

When running the following 2 lines:

tfms = get_transforms(do_flip=False)
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=26)

I get the following error:

"/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/data_block.py:451: UserWarning: Your training set is empty."

(followed by an index error).

I haven’t modified anything, just run the whole notebook top to bottom. Everything above this works fine.

Any help on this would be much appreciated :slight_smile:

I’m running on Paperspace.

Thanks.

Ian
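
For anyone hitting the same warning, here is a minimal check of the layout ImageDataBunch.from_folder expects; the untar_data call matches the lesson notebook, and the comment about the cause is my assumption rather than a verified diagnosis:

from fastai.vision import *

path = untar_data(URLs.MNIST_SAMPLE)
path.ls()  # should list 'train' and 'valid' folders, each holding one subfolder per class

tfms = get_transforms(do_flip=False)
# an "empty training set" usually means `path` does not point at a folder
# containing the expected train/ and valid/ subfolders
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=26)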

Hi Everyone,
I was trying to build an image classifier to classify rugby vs. American football images after watching the Lesson 1 (2019) video. I am getting an error rate of 0.02, but I have the following doubts:

1. I have put around 100 images each of American football and rugby into a Google Drive folder and used the Path() function to specify that folder as the path of my images. Is that the right way to specify a dataset?
2. The most confused images (as attached) show that the model predicts rugby less confidently, when it actually should have predicted rugby with a higher probability. Can someone please explain exactly what that means?
3. interp.most_confused(min_val=2) outputs [ ]. Is that the case for all datasets which have only 2 classes?
4. In the cricket vs. baseball example (from the video), plot_with_title() was used, but I am currently unable to use that function. Is the function still available?

If anyone could clear up these doubts, it would be really helpful.
Thank You!
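
Not a full answer, but a minimal fastai v1 sketch of the interpretation calls being asked about; the learner variable learn is assumed to be an already-trained classifier:

interp = ClassificationInterpretation.from_learner(learn)

# each title reads: prediction / actual / loss / probability of the actual class
interp.plot_top_losses(9, figsize=(7, 7))

# returns (actual, predicted, count) tuples for confusion-matrix cells with
# count >= min_val; with only 2 classes and very few mistakes, [] is plausible
interp.most_confused(min_val=2)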

Sharing my test application for Lesson 1.

Detecting Face Expressions.

The error rate in the paper from 1999 is 25%; here it's 4%.

Full notebook

I took a database of face expressions http://www.kasrl.org/jaffe.html (213 images of 7 facial expressions (6 basic facial expressions + 1 neutral) posed by 10 Japanese female models)
DI - Disgust
SA - Sadness
AN - Anger
HA - Happy
FE - Fear
SU - Surprise
NE - Neutral

I ran resnet50 for 50 cycles (I trust that something in the fastai library won't let me overfit). The error rate went from 80% to 30% after 10 cycles, and finally to 12% after all 50 cycles. Wow.


Then I tried to unfreeze. From the learning rate plot, I chose 1e-5,1e-2.

I unfroze the learner and ran 20 more cycles. The error went up from 11% to 50% by the 4th cycle, and then slowly came down to 4% in the last 3 cycles. I have a strong suspicion this is still overfitting. It is astonishingly good.


Hi, can you tell me what it means when interp.plot_top_losses gives results like AN/AN/0.45/0.64?
From what I understood after watching Lesson 1, in this case the model predicted the expression of the face as Anger with a probability of 45%, while it should have predicted Anger with a probability of 64%. What is the significance of this slight difference, and why does such an error occur?

Overfitting would mean that your validation loss gets worse as you continue training. It's in the nature of fit_one_cycle that the loss gets worse at first, but as long as it's going down in the end, you're fine.

One thing I noticed: when you unfreeze your model, you have to call lr_find after you call unfreeze(). In the notebook you call it before, which really doesn’t give you the information you need for choosing the learning rate.
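
In code, the order being described, with the cycle count and learning rates taken from the post above purely as an illustration:

learn.unfreeze()   # enlarge the set of trainable parameters first
learn.lr_find()    # only now does the LR plot reflect the unfrozen model
learn.recorder.plot()
learn.fit_one_cycle(20, max_lr=slice(1e-5, 1e-2))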

someone please help:

ImportError                               Traceback (most recent call last)
----> 1 from fastai.vision import *
      2 from fastai.metrics import error_rate

ImportError: No module named 'fastai'

I am running GCP… I did everything as mentioned in the tutorial post about using GCP. It is such a shame that I am stuck before I can even start.

I feel so shitty… I already feel like giving up…

Running "conda list" on my remote GCP machine shows that fastai is installed.
Basically everything is installed, and yet the notebook kernel (python3) cannot even find the fastai module. What do I do?


I am not willing to give up, but this is very DEMOTIVATING.

@gluttony47 I have also just started trying to go through the tutorials, and I am trying to use GCP as well. I have hit the same issue as you, and I also have fastai in my conda list. I was wondering if you had heard anything, and I wanted the issue to be seen again as well.

@gluttony47 I found a solution using a combination of threads.

I used this thread to create a conda environment: Fastai v0.7 install issues thread

I then used this thread to actually run the notebooks. You have to activate the environment, then start a Jupyter notebook port, then start another SSH connection in another bash session and connect there: How to setup fastai v0.7 on gcp instance that is setup for 2019's Part 1

Based on what is taught in Lesson 1, I trained a resnet50 model to classify pictures of Romanesque vs. Gothic cathedrals. I achieved an error rate of 5.1%. My notebook, along with the text files containing the URLs of the images I used for training and validation, can be found here: https://github.com/g-vk/fastai-course-v3

Here is a preview of the notebook:


Notes on Lesson 1 with some of my additions and clarifications. I hope some of you will find them helpful. Feel free to leave any comments.


Great job @gvolovskiy… I have a doubt… following the lesson notebook, I understood that the learning rate must be found before the unfreeze step. Something like this:
learn.load('stage-1')
learn.lr_find()
learn.recorder.plot()
learn.unfreeze()
Please let me know!

Before unfreezing, the set of trainable parameters is the same as when training the head of the model. Since after loading the stage-1 weights the weights of the head are already trained, there is no need to train them again, and hence no need to look for a good learning rate. It is only after we enlarge the set of trainable parameters by invoking learn.unfreeze() that looking for a suitable learning rate becomes necessary.
I hope my explanation was helpful for you.
