Does overfitting matter if validation accuracy is higher?

grahac · June 2, 2017, 2:26am

First and foremost - thank you so much for putting this course together. It is awesome. I have been studying/playing with DL for a long time and for the first time I feel like I am starting to feel like I am getting it. (It feels like I am gaining a super power!).

I have a question on overfitting. I am running a CNN on private data to try to identify which pictures are the most clicked on based on just the image.

I ran a standard VGG16 BN and ended up with a test accuracy of .60 and a validation accuracy of 0.65.

I re-ran it after removing drop out and now have a test accuracy of .85 and a validation accuracy of 0.74!.

I am very excited as the validation accuracy is much much higher, but it now is an overfitting model. My question:

Given the second model is overfitting in a big way- is it still better than the first model or does the fact it significantly overfits outweigh the fact the validation accuracy is much higher?

I would assume the validation accuracy is what matters in the end of the day and that overfitting only matters in that it means your model cannot continue to improve with future epochs but just wanted to make sure.

Also - given I am now overfitting, should I re-introduce a smaller dropout?

Thanks again!

niazangels · June 2, 2017, 8:13pm

Overfitting means good performance on the training data, poor generalisation to other data. So technically this might not be overfitting.

I’m curious to know what your training accuracy was. You mentioned your test accuracy and your validation accuracy. Was your training accuracy closer to the validation accuracy or the test accuracy?

If the training accuracy is closer to the test accuracy than the validation accuracy, then perhaps you simply have a bad representation of data in your validation set. A good validation set should be a representation of your test data or what you’ll see in your real world. If your validation accuracy is not not in correlation with your test scores, its best that you try to work on making a better validation dataset.

In the Statefarm distracted driving competition, it was mentioned that the drivers in the training set will not be present in the test set. So in this case, when we created a validation set, we had to make sure that the validation set did not contain any driver in our training set. Little things like this tend to throw us off a lot, so its worth our while to prepare our data carefully.