Increasing ResNet34/50 accuracy at all cost?

Hi. I’ve been having issues with using FastAI for my project. I’ve been using the ResNet architecture to implement a fruit and vegetable classification system. So far I’ve been able to squeeze 92% accuracy out of either model (34 and 50 layers), but I’m stuck there.

This seems to be the max accuracy I can reach, even though I’ve tried different batch sizes, learning rates, data multiplication and augmentation, loss functions, and optimizers.

I’d like to get the accuracy up to 95%, at any cost to training time. Do you think this is possible?

Thanks a lot!

Hi there!

First of all, I assume you are talking about validation accuracy, and that training accuracy is also similar…
If you already tried all those things, maybe the problem lies elsewhere. Either:

  • You don’t have enough training data, or it’s not diverse enough for the model to learn the differences. The solution here would be to clean up your data and get more of it… That could improve the accuracy.

  • The architecture you are using is not the best one for the job. Instead of a better resnet, you could try switching architectures and see if that improves things. This notebook from lesson 3 might help…

Hope this helps!

Thanks! Do you think it would be fine for me to simply clone the data say ten times (currently I have 3000 images with 33 classes) and simply apply augmentation to some of the photos?

Thanks again!

Not that much…
You have to think of your dataset as “information” you are feeding the neural network.
Augmentation can work really well for certain information “gaps”, for example having the same fruit but from slightly different angles/rotations/zooms. But at some point it stops helping, and it can even harm your performance, because you can overfit to your multiplied data and the model ends up being worse when you feed it real-world examples (not in your dataset).
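One concrete way cloning can backfire, sketched below as an illustration rather than anything measured on this dataset: if the 3000 images are copied ten times before a random train/validation split, nearly every validation image also has a copy in the training set, so validation accuracy stops telling you much about real-world examples. The function name `leaked_fraction` and the 20% validation split are assumptions made up for this sketch.

```python
import random


def leaked_fraction(n_images=3000, copies=10, valid_pct=0.2, seed=0):
    """Fraction of validation items whose source image also appears in training."""
    rng = random.Random(seed)
    # Each item is identified only by the index of the image it was cloned from.
    items = [img for img in range(n_images) for _ in range(copies)]
    rng.shuffle(items)

    # Random split made AFTER cloning (the problematic order).
    n_valid = int(len(items) * valid_pct)
    valid, train = items[:n_valid], items[n_valid:]

    train_sources = set(train)
    return sum(img in train_sources for img in valid) / len(valid)


print(f"{leaked_fraction():.1%} of validation items have a clone in training")
```

Splitting first and then augmenting only the training split avoids that overlap.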

You might also want to look more closely at your failed examples, that 8% in your validation set that the model gets wrong. Maybe that can provide some insight as to where you could improve your data.
For example, you could discover that the model always misses when classifying small fruits that are depicted in batches (e.g. a handful of blueberries). Then you could add more of those to your dataset, and that should improve the metrics.
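A minimal sketch of that kind of error analysis, assuming you have already collected the true and predicted labels for your validation set as two plain lists. The label values and the helper `summarize_errors` are made up for illustration. It tallies per-class accuracy and the most frequent (actual, predicted) mix-ups, which is usually enough to spot a class that needs more or better data.

```python
from collections import Counter


def summarize_errors(y_true, y_pred, top_k=3):
    """Return per-class accuracy and the most common (actual, predicted) mistakes."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    totals = Counter(y_true)  # validation examples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    confusions = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)

    per_class_acc = {label: correct[label] / n for label, n in totals.items()}
    return per_class_acc, confusions.most_common(top_k)


# Hypothetical validation results, for illustration only.
y_true = ["apple", "apple", "blueberry", "blueberry", "blueberry", "carrot"]
y_pred = ["apple", "apple", "grape", "grape", "blueberry", "carrot"]

per_class_acc, worst_pairs = summarize_errors(y_true, y_pred)

# Print classes from weakest to strongest.
for label, acc in sorted(per_class_acc.items(), key=lambda kv: kv[1]):
    print(f"{label}: {acc:.0%}")
print(worst_pairs)
```

If you would rather stay inside fastai, `ClassificationInterpretation.from_learner(learn)` gives similar views through `most_confused` and `plot_top_losses`.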

Another thing to consider is how many categories you are trying to predict.
Maybe you could narrow your problem at first (e.g. having fewer fruit or vegetable options, or even classifying fruit vs vegetable regardless of the type) and work from there.
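A cheap way to try the coarser framing before relabeling or retraining anything is to collapse the predictions you already have into groups and compare accuracies. The sketch below does that with a hypothetical `COARSE` mapping and made-up labels. Scoring collapsed predictions is only a proxy for actually training on the coarser labels, but it shows how much of the error is confusion within a group.

```python
# Hypothetical mapping from fine-grained classes to a coarser label.
COARSE = {
    "apple": "fruit",
    "banana": "fruit",
    "carrot": "vegetable",
    "potato": "vegetable",
}


def accuracy(y_true, y_pred):
    """Fraction of positions where the two label lists agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def to_coarse(labels, mapping=COARSE):
    """Replace each fine-grained label with its coarse group."""
    return [mapping[label] for label in labels]


# Made-up validation labels and predictions, for illustration only.
y_true = ["apple", "banana", "carrot", "potato"]
y_pred = ["banana", "banana", "carrot", "apple"]

fine_acc = accuracy(y_true, y_pred)
coarse_acc = accuracy(to_coarse(y_true), to_coarse(y_pred))

print(f"fine-grained accuracy: {fine_acc:.0%}")
print(f"fruit-vs-vegetable accuracy: {coarse_acc:.0%}")
```

A large gap between the two numbers suggests most mistakes are between similar classes, which points back at the data for those specific classes.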

Thanks for the reply. Do you think it would be worth it to try InceptionResnetV2? I’ve been looking at some papers and they say that the accuracy is somewhat better than that of Resnet, but it’d be harder for me to implement in fastai.

It won’t necessarily be better. Inception models were a good solution to the problem of building really deep networks, but they were state-of-the-art many years ago.

If implementing such a model would be time-consuming, I would suggest trying other out-of-the-box solutions first, to see if just changing the architecture you are using helps the accuracy.

As you mentioned training time is not an issue (I assume inference too?), might I suggest just going with bigger models? You really should check out the notebook I referenced earlier, which is part of lesson 2 if I remember correctly. There, you can see a comparison between popular architectures, and find bigger alternatives to ResNet, and all the models cited there are part of the timm library, so you can implement them quite easily in

But improving accuracy beyond a certain point will require more diverse and involved approaches, as I mentioned before: going through your data to see where the model performs badly and fixing it, getting more quality data, or even redefining the problem to overcome certain domain restrictions (e.g. beyond a certain point, distinguishing between different vegetables/fruits becomes difficult for image classification models, so you should approach it from a different perspective altogether).