High accuracy for train/validation, but much lower accuracy for test

Hi everyone!

I’m currently training a classifier and have managed to get the accuracy to a decent level:

Stage 1 (transfer learning):
[screenshot: training and validation metrics]

Stage 2 (unfreezing and training the whole network):
[screenshot: training and validation metrics]

The validation accuracy of 97.7% looks really good, but when I tested on new, unseen data, I only got 61% accuracy.

It seems the model has overfit to both the training and validation data? I would have thought that if the training and validation accuracies are both good, there is a good chance it would work well on the test set too.

Does anyone have any insight on what I should do next? What are the regularization methods available in fastai that I can use?

Thanks in advance!

How representative of your test set is your training data? I always see a drop of a few percent in my models’ accuracy on test sets, so a gap isn’t unheard of. Try training for fewer epochs and see if that helps.
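One way to make "training for less" systematic is early stopping: keep training while validation accuracy improves, and stop once it has plateaued for a few epochs. fastai provides this as a callback, but the idea is framework-agnostic. Below is a minimal plain-Python sketch; `train_one_epoch` and `validate` are hypothetical stand-ins for your real training and evaluation steps.

```python
# Sketch of early stopping: stop when validation accuracy has not
# improved for `patience` consecutive epochs. The train_one_epoch and
# validate callables are placeholders for a real training loop.

def train_with_early_stopping(train_one_epoch, validate,
                              max_epochs=30, patience=3):
    best_acc, best_epoch = 0.0, 0
    for epoch in range(max_epochs):
        train_one_epoch()
        acc = validate()
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            # validation accuracy has plateaued; stop before
            # overfitting gets worse
            break
    return best_acc, best_epoch

# Example with a fake validation-accuracy curve that peaks then degrades
curve = iter([0.90, 0.95, 0.97, 0.96, 0.95, 0.94, 0.93])
best_acc, best_epoch = train_with_early_stopping(
    train_one_epoch=lambda: None,
    validate=lambda: next(curve),
)
# best_acc == 0.97, best_epoch == 2
```

In fastai you would get the same behaviour by passing an early-stopping callback to the fit call rather than hand-rolling the loop, but the stopping criterion is the same.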

The test data is collected in the same way as the training/validation data, so they should theoretically be from the same distribution.

I’ll try training for fewer epochs and see whether things improve.

Another strategy might be to collect more training data. I would start with error analysis on the test set, to figure out what data the model is getting wrong, and try to make sure I have similar data in the training set.
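A quick first pass at that error analysis is just tallying the test-set mistakes by (true label, predicted label) pair, so you can see which confusions dominate and go inspect those examples. Here is a small sketch in plain Python; the labels and predictions are made-up placeholders for your own test-set outputs.

```python
from collections import Counter

def error_breakdown(y_true, y_pred):
    """Count misclassifications by (true, predicted) label pair,
    most common first -- a quick first pass at error analysis."""
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return errors.most_common()

# Hypothetical test-set labels and model predictions
y_true = ["cat", "cat", "dog", "dog", "dog", "bird", "bird"]
y_pred = ["cat", "dog", "dog", "cat", "cat", "bird", "dog"]

for (true_lbl, pred_lbl), n in error_breakdown(y_true, y_pred):
    print(f"{n} x true={true_lbl}, predicted={pred_lbl}")
# The dominant confusion here is dog -> cat (2 errors): inspect those
# images first and check whether the training set has similar examples.
```

Once you know which pairs account for most of the errors, you can look at those test images directly and decide whether the training set is missing that kind of data.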

Good call on the error analysis, to check that the test set is similar to the training set. Will do that next!