Strange behavior with data and lr_find()

There is something going on with my model that is making it have a very high error rate in the beginning:

I am not sure if this is normal. My model does have some pictures which make it difficult to identify the object, but for most of them, the answer is pretty obvious.

Obviously, if I train it more, the error rate will continue to decline as such:

If I train it one more time, this is what I get:

I have followed all of the steps up until when I plot the lrs in the following strange graph:

When I train it a second time, I get this lr graph:

And after training the model for the third time, I get this lr graph:

Could someone tell me what is going on? I tried to follow the same steps the professor did in the 1st video of part 1. I am currently using the most up-to-date version of the fastai library, if that matters.

Hey Michael!

  1. Why do you say this is a high error rate? How many classes do you have? Don’t forget that at the very beginning, your model won’t be much better than random, so your error rate is fine as long as it beats random guessing after the first epoch. For example, if you have 10 classes, the random-guess error rate is 0.9, so please check how many classes you have :slight_smile:
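The random baseline mentioned above is easy to compute; here is a quick sketch in plain Python (the helper name is mine, not anything from fastai):

```python
# For a balanced dataset, a model guessing uniformly at random is correct
# 1/num_classes of the time, so its expected error rate is 1 - 1/num_classes.
def random_baseline_error(num_classes):
    return 1 - 1 / num_classes

print(random_baseline_error(10))  # 10 classes -> random error rate of 0.9
print(random_baseline_error(4))   # 4 classes  -> random error rate of 0.75
```

Anything below that number after the first epoch means the model is already learning something.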

  2. I think you have uploaded the same image of your training step twice.

  3. Why do you say the graph is strange? If it is because you don’t see the “divergent” part of the graph, then maybe you should play with the skip_end parameter, like learn.recorder.plot(skip_end=0), for example.
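For intuition, here is a rough sketch (my own illustration, not fastai’s actual code) of what skip_end does: the recorder stores one (lr, loss) pair per iteration, and the plot simply trims points from each end, so skip_end=0 reveals the divergent tail that the default setting hides:

```python
# Toy (lr, loss) history such as lr_find might record (values invented).
lrs    = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]
losses = [2.0,  1.95, 1.6,  0.9,  0.7,  1.5,  8.0]

def trim(lrs, losses, skip_start=0, skip_end=5):
    """Drop `skip_start` points from the front and `skip_end` from the back."""
    end = len(lrs) - skip_end if skip_end else len(lrs)
    return lrs[skip_start:end], losses[skip_start:end]

# With the default skip_end the exploding tail is cut off;
# with skip_end=0 the full curve, divergence included, is kept.
print(trim(lrs, losses, skip_end=5))
print(trim(lrs, losses, skip_end=0))
```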

Hope that helps!


Hey there, I have 4 classes. Also, I say the graph is strange because the loss goes down as the learning rate gets higher, which is not the behavior shown in the lesson.

When I tried to train the unfrozen model for one epoch after choosing the lr range (where the downward slope is steepest), I still get a high error rate, whereas the professor got a low one.
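An aside on lr ranges: my understanding (an assumption on my part, not checked against the fastai source) is that passing a range like slice(lo, hi) spreads learning rates across the layer groups, lowest for the early layers and highest for the head. A minimal sketch of that idea, with a geometric spread:

```python
# Sketch: spread n_groups learning rates geometrically between lo and hi,
# so early layer groups train slowly and later ones train faster.
def spread_lrs(lo, hi, n_groups):
    if n_groups == 1:
        return [hi]
    ratio = (hi / lo) ** (1 / (n_groups - 1))
    return [lo * ratio ** i for i in range(n_groups)]

# Three layer groups between 1e-4 and 1e-3.
print(spread_lrs(1e-4, 1e-3, 3))
```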

After more experience you will love that your graph looks like that. :slight_smile: It’s when it doesn’t look like that that you should start to worry. It looks like you can keep training for even better results; when the graph starts to go up, you know you’re near your best possible result.

For reference, this is how your lr_find plot will look most of the time when you use a pretrained model. You can clearly see three phases here:


  1. When the learning rate is too small (from 1e-6 to 1e-4), the steps that the parameters of your model take will be very small, resulting in almost no decrease in your loss.

  2. You then enter the interesting part of the graph, where the loss decreases really fast (from 1e-4 to 1e-1). You usually want to take a learning rate as high as possible without being too close to the minimum (a good rule of thumb is to take that value and divide it by 10, so 1e-2 in our case).

  3. Your learning rate is now too high, resulting in a divergent behaviour and an exploding value of the loss.
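The phase-2 rule of thumb can be sketched numerically (toy values, my own illustration, not fastai’s suggestion logic): find the segment where the loss drops fastest, then back off from that learning rate by a factor of 10:

```python
# Toy lr_find history (values invented for illustration).
lrs    = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
losses = [2.00, 1.98, 1.90, 1.20, 0.70, 5.00]

# Loss drop on each consecutive segment; the biggest drop marks the
# steepest descent, and dividing by 10 stays clear of the minimum.
drops = [losses[i] - losses[i + 1] for i in range(len(losses) - 1)]
steepest = drops.index(max(drops))
suggested_lr = lrs[steepest + 1] / 10
print(suggested_lr)
```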

So you don’t have to worry about your graph; it is actually really promising! :slight_smile:

About your high error after training the unfrozen model, could you please upload some images of your training run and your lr_find plot? I’ll take a closer look :wink:


Hey, I fixed the error rate problem! I used 1e-4 and 1e-3 instead of 1e-04 and 1e-03. :upside_down_face:

Thank you so much for your post @NathanHub

If our model’s initial lr_find plot doesn’t have a shape similar to the graph in the lecture or to your plot, does that indicate any sort of problem?

I am not sure where I would select a lr given my plot as it doesn’t have the phases you mentioned. Any thoughts appreciated!

Hi @aksg87!

I’m pretty sure your graph would look the same if you plotted a wider range (e.g. from 1e-8 to 1e-3); you should then get the same behaviour as mine.

Hope that helps! :slight_smile:

Thanks! Will definitely try that