Unable to overfit

I am leading a study group for this course and we are on Lesson 2. I have my own dataset and am trying to overfit just to demonstrate the concept and what it might look like. The behavior I was hoping to see is that after a while the training loss would keep going down while the error_rate (on the validation set) would go up. I have tried a bunch of things: switching to a resnet50 from a resnet18, and running for 100 epochs instead of the 4 epochs in the notebook (this inadvertently led to a much better model in terms of error_rate :slightly_smiling_face:). I tried learn.unfreeze(). I tried calling learn.fit instead of learn.fine_tune. I tried getting rid of data augmentation. I'm probably missing something obvious here or not fully understanding the problem. Any help would be appreciated.
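(Aside: the essence of what you're hoping to see can be sketched without fastai at all. Here is a minimal, stdlib-only toy, names all made up, in which a "model" that purely memorises a noisy training set gets perfect training accuracy but much worse validation accuracy. The same gap is what you're trying to surface with the resnet.)

```python
import random

random.seed(0)

def make_data(n, noise=0.3):
    # true rule: label 1 if x > 0.5, but `noise` fraction of labels are flipped
    data = []
    for _ in range(n):
        x = random.random()
        y = int(x > 0.5)
        if random.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

train = make_data(50)
valid = make_data(200)

def predict(x, memory):
    # 1-nearest-neighbour "model": pure memorisation of the training set
    return min(memory, key=lambda p: abs(p[0] - x))[1]

def accuracy(data, memory):
    return sum(predict(x, memory) == y for x, y in data) / len(data)

print(accuracy(train, train))  # 1.0: every training point is its own nearest neighbour
print(accuracy(valid, train))  # noticeably worse: the label noise was memorised too
```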

Hey Jonathan!

Could you please share your code? A few things which could help demonstrate overfitting are not using a pre-trained model, not using dropout, not using weight decay, and using the highest learning rate suggested by learn.lr_find().

Thanks for the quick reply! I was thinking not using the pre-trained model would be helpful but haven’t tried that yet. Does the cnn_learner use dropout and weight decay by default?

My code is essentially identical to notebook 2 so I will share some things I tried.

learn = cnn_learner(dls, resnet50, metrics=error_rate)

I also omitted batch_tfms=aug_transforms() with all of these combos

It does not use weight decay by default.

How big are your images? Did you try cranking up the epochs beyond 100? It looks like you’re using the default learning rate in that snippet, which is 0.001 – did you also try to increase this?

It might just be that your dataset has enough features that it's a bit more difficult to overfit :slight_smile:

Thanks for sharing your code!

Feel free to inspect the docs or the source code to see the arguments and their defaults - it’ll help you answer your own questions :slight_smile:

Could you try something like:
learn = cnn_learner(dls, resnet50, pretrained=False, config=cnn_config(ps=0.), metrics=error_rate)
And let me know how training goes?

I will try that later and let you know, thanks.

The images are from google image search so I think they vary in size, but I am doing a random resize to 224 (although I suspect you are asking about the original size). I have not tried going beyond 100 epochs because the training loss was basically at 0 by the end, though I'm not sure that's a good reason to stop. I also did not try increasing the learning rate, so I will give that a shot as well.

Hi Jonathan. The head grafted onto the resnet by fastai contains two Dropout layers. You can remove their effect with something like:

for m in learn.model[1]:  # learn.model[1] is the fastai head
    if isinstance(m, torch.nn.Dropout): m.p = 0.0

That ought to get resnet to misbehave!
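To illustrate this idea in a self-contained way (plain PyTorch, no fastai; the `zero_dropout` helper and the toy head are made up for the example), you can walk any module tree and set `p = 0` on every Dropout, which makes it a no-op:

```python
import torch.nn as nn

def zero_dropout(model: nn.Module) -> None:
    """Set p=0 on every Dropout module so it no longer drops anything."""
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.p = 0.0

# toy stand-in for a model head: Linear -> Dropout -> Linear
head = nn.Sequential(nn.Linear(8, 4), nn.Dropout(0.5), nn.Linear(4, 2))
zero_dropout(head)
print([m.p for m in head.modules() if isinstance(m, nn.Dropout)])  # [0.0]
```

The same call should work on a fastai learner via `zero_dropout(learn.model)`, since it just recurses through submodules.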

BTW, I recently played with a resnet that would not even learn the training set. The cause was too high a setting for Dropout.