Problems with overfitting DenseNet121


I have been trying to overfit the DenseNet121 model (from the PyTorch model zoo) on 10 images, but have been unable to do so. The train loss only drops to 0.20 after 50 epochs.

I have also tried a one convolution -> flatten -> dense output model, and that achieves a 0.002 train loss after 50 epochs.
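
A baseline of this shape can be sketched roughly as follows. The input size (3x32x32), channel count, kernel size, and number of classes are illustrative assumptions, not the values used in the experiment described here.

```python
import torch
import torch.nn as nn

# Assumed values for illustration only.
NUM_CLASSES = 2
IMAGE_SIZE = 32

# One convolution -> flatten -> dense output.
baseline = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1),
    nn.Flatten(),
    nn.Linear(8 * IMAGE_SIZE * IMAGE_SIZE, NUM_CLASSES),
)

# A batch of 10 random images stands in for the 10 training images.
batch = torch.rand(10, 3, IMAGE_SIZE, IMAGE_SIZE)
logits = baseline(batch)
print(logits.shape)
```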

Everything except the architecture is identical.

I really don’t understand why DenseNet121 is not able to overfit as well as the basic network. Could someone please shed some light on this for me?


Did you turn off ALL data augmentations?

Otherwise the network sees each picture in a slightly different form every epoch and may not be able to fully overfit.
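
One quick way to check this is to fetch the same sample twice and compare the tensors. The sketch below assumes the dataset returns `(image, label)` pairs. `is_deterministic` and `NoisyDataset` are hypothetical helpers written for this example, not part of PyTorch.

```python
import torch
from torch.utils.data import Dataset, TensorDataset


def is_deterministic(dataset, num_samples=10):
    """Return True if fetching the same index twice yields identical images."""
    for i in range(min(num_samples, len(dataset))):
        first, _ = dataset[i]
        second, _ = dataset[i]
        if not torch.equal(first, second):
            return False
    return True


class NoisyDataset(Dataset):
    """Toy stand-in for a dataset that still applies a random augmentation."""

    def __init__(self, images, labels):
        self.images = images
        self.labels = labels

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Random noise plays the role of a random crop, flip, or jitter.
        image = self.images[idx]
        return image + 0.1 * torch.randn_like(image), self.labels[idx]


images = torch.rand(10, 3, 32, 32)
labels = torch.randint(0, 2, (10,))

clean = TensorDataset(images, labels)   # no augmentation
noisy = NoisyDataset(images, labels)    # augmentation still active

print("clean:", is_deterministic(clean))
print("noisy:", is_deterministic(noisy))
```

If the check fails on your training dataset, some random transform is probably still active in the pipeline.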
