I have been going through part 2 of the DL course and working on Plant Seedling Classification on Kaggle. With the methods taught in the course, I have been able to reach 97% accuracy on the challenge, but I’d like to go further and no amount of extra training is improving it.
Information about my model:
I trained the penultimate layer first, then unfroze the earlier layers and trained them multiple times with differential learning rates, and used test-time transformations for the final predictions.
That is where I would love to help.
There are many ways to extract every last ounce of accuracy out of your model, but they will require patience.
First, check if your model is overfitting or underfitting.
- if it is overfitting,
– use standard regularization techniques such as dropout in your model,
– try to collect more data
- if it is underfitting,
– if you are using regularization techniques such as dropout, make them less aggressive
– try a more complex model
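One rough way to make that check concrete is to compare the training and validation loss at the end of training. The snippet below is only a sketch: the function name `diagnose_fit` and the 20% `rel_gap` threshold are assumptions I made up for illustration, not something from the course, so tune them for your own runs.

```python
def diagnose_fit(train_loss: float, valid_loss: float, rel_gap: float = 0.2) -> str:
    """Heuristic fit check from final losses (rel_gap is an arbitrary threshold)."""
    if train_loss <= 0 or valid_loss <= 0:
        raise ValueError("losses must be positive")
    # Validation loss clearly above training loss: the model memorizes the training set.
    if valid_loss > train_loss * (1 + rel_gap):
        return "overfitting"
    # Training loss clearly above validation loss: the model has not learned enough yet.
    if train_loss > valid_loss * (1 + rel_gap):
        return "underfitting"
    return "balanced"


if __name__ == "__main__":
    print(diagnose_fit(0.05, 0.30))
    print(diagnose_fit(0.60, 0.40))
```

If it says "underfitting", train longer or reduce dropout; if it says "overfitting", add regularization or more data, as in the list above.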
Once you have a model that is complex enough to grasp the problem, i.e. it gives you reasonable accuracy and starts to overfit if you train it too long (which seems to be the case with your model), bingo, you have a good model.
Next, you need to find what your model is not getting right, and ask whether it is supposed to get it right. Let’s take an example.
In the cat and dog competition, our model is unable to predict the pictures above correctly. But the first 3 pictures it isn’t supposed to get right (the first one has both a cat and a dog), because they don’t even make sense to a human. Forcing our model to correct these predictions will make it even worse.
But, for the last picture, we can argue that it is supposed to get it right. Here is what you can do to make it happen.
– Introduce multiple slightly augmented copies of the picture into your training set.
– Find similar pictures and add them to your training set.
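To find those pictures in the first place, one option is to rank the validation images by how confident the model was in a wrong class and then look at the top few by hand. This is a minimal sketch in plain Python; `most_confident_mistakes` is a hypothetical helper, and it assumes you already have per-image class probabilities and integer labels as lists.

```python
def most_confident_mistakes(probs, labels, k=5):
    """Return up to k (index, predicted_class, confidence) tuples for wrong
    predictions, sorted from most to least confident."""
    mistakes = []
    for i, (p, y) in enumerate(zip(probs, labels)):
        pred = max(range(len(p)), key=p.__getitem__)  # argmax over classes
        if pred != y:
            mistakes.append((i, pred, p[pred]))
    mistakes.sort(key=lambda m: m[2], reverse=True)
    return mistakes[:k]


if __name__ == "__main__":
    probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
    labels = [0, 0, 1, 1]
    for idx, pred, conf in most_confident_mistakes(probs, labels):
        print(f"image {idx}: predicted class {pred} with confidence {conf:.2f}")
```

The images it returns are the ones to inspect: drop the ones that are genuinely ambiguous, and augment or find look-alikes for the ones a human would get right.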
The next step would be to ensemble.
Train multiple models with slight variations of data augmentation and then average their probabilities. After that, train multiple models of different architectures (resnet34, resnet50, resnext50) and average their predictions to get the final prediction.
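The averaging step itself is simple. Here is a sketch in plain Python, assuming each model gives you a list of per-image class probabilities in the same image and class order; the names `average_probabilities` and `predict_classes` are made up for this example.

```python
def average_probabilities(model_probs):
    """Average class probabilities across models.

    model_probs[m][i][c] is model m's probability of class c for image i.
    """
    n_models = len(model_probs)
    n_images = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [sum(model_probs[m][i][c] for m in range(n_models)) / n_models
         for c in range(n_classes)]
        for i in range(n_images)
    ]


def predict_classes(avg_probs):
    """Pick the class with the highest averaged probability for each image."""
    return [max(range(len(p)), key=p.__getitem__) for p in avg_probs]


if __name__ == "__main__":
    model_a = [[0.6, 0.4], [0.2, 0.8]]
    model_b = [[0.3, 0.7], [0.4, 0.6]]
    model_c = [[0.3, 0.7], [0.1, 0.9]]
    avg = average_probabilities([model_a, model_b, model_c])
    print(predict_classes(avg))
```

Note that in this toy example model_a alone would pick class 0 for the first image, but the ensemble picks class 1 because the other two models outvote it. That is the whole point of averaging probabilities rather than trusting a single model.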
Have you tried increasing the size of the images and retraining the model?