I tried to overfit, help me understand the results

Hi guys,

So, I wanted to see for myself what happens in lesson 1 when we really overfit the model, and ran it for 500 epochs. Training loss went down (as expected), validation loss went up (as expected). Here's what puzzles me though:
Accuracy barely changed. From epoch 0 to epoch 500 it hovered around 91%, dipping half a percent here, coming back up a tiny bit there. In the end, epoch 0 had 91% and epoch 500 had 90.75%.

I was expecting the accuracy to really go down with overfitting, to something like 70% or even worse… Is there some mechanism built into ResNet that prevents this?

Thanks for the help :slight_smile:

Accuracy here is based on a 0.5 cutoff, since this is a binary classifier (cats vs. dogs). The model might just be overconfident: it gets penalized more heavily for the things it gets wrong, so validation loss goes UP. At the same time, if the training loss is really low, the model has overfit to the training data, and training for more epochs isn't going to help much.
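
A quick numeric illustration (plain PyTorch, probabilities made up): a prediction can go from mildly wrong to confidently wrong, which blows up the loss, while the accuracy at the 0.5 cutoff doesn't move at all.

```python
import torch
import torch.nn.functional as F

# Made-up probabilities for one validation image whose true label is "cat" (1)
for p_cat in [0.40, 0.10, 0.01]:  # model grows more confidently wrong
    p = torch.tensor([p_cat])
    loss = F.binary_cross_entropy(p, torch.tensor([1.0]))
    pred = (p > 0.5).float()  # 0.5 cutoff -> the prediction is "dog" every time
    print(f"p(cat)={p_cat:.2f}  loss={loss.item():.2f}  correct={bool(pred == 1)}")
# loss climbs from 0.92 to 2.30 to 4.61, but accuracy on this image is 0 throughout
```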

Yeah, I know that overfitting isn't something we generally want. I just wanted to see for myself how bad the accuracy would get if I really went for it, and I was kinda disappointed that the accuracy didn't change (or at least not more noticeably).

Maybe it’s indeed because we have a Binary Classifier. Overfitting would for example change an estimation from 90% cat to 75% cat. Which is strictly worse but doesn’t change the accuracy because it’s still the correct classification. I imagine that in a problem with many classes the probabilites are often much closer and so overfitting would change the accuracy more noteably… I’ll give that a try later on in the course :wink:

I want to mention that it depends on what you define as bad. In the course, Jeremy presented a state-of-the-art model which achieved around 99% accuracy. A model with 91% accuracy is way, way worse than that. It's all about your definition of good and bad.

If you really want to decrease the performance of this model, unfreeze more layers. By default you are just fine-tuning the very last layer, which is probably why you don't see much difference even after so many epochs.
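
Something like this, assuming the lesson-1 `learn` object (the learning-rate values are just placeholders, and the argument is named `max_lr` in fastai v1 rather than `lr_max`):

```python
# Unfreeze the whole pretrained body so every layer gets trained, not just the head
learn.unfreeze()

# Discriminative learning rates: small for early layers, larger for the head
learn.fit_one_cycle(10, lr_max=slice(1e-5, 1e-3))
```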


@bottaio yes, I didn't unfreeze, that was the problem. I reran it (though not for 300 epochs, because unfrozen epochs take at least ten times as long) and now the results were more like what I was expecting. It's still a well-trained model, so I can't completely destroy it through overfitting, but yeah, I saw more of what I was looking for :wink:
So thank you :slight_smile: