Why is my neural network only predicting 2 out of 5 classes in the confusion matrix?

The code for my kernel, written on Kaggle, can be found here: https://github.com/LucidLefo/retinopathy-fastai-basic/blob/master/diabetic-retinopath-fastai.ipynb

This is part of the diabetic retinopathy competition on Kaggle, where I wanted to use what I learned from the resources here to do something with the fastai library. For some reason, even though the classes in the data go from 0 to 4, the model seems to predict only 0 and 2. I don't think overall accuracy is the problem, since it ended up at ~75%, which seemed satisfactory to me.

So, what am I doing wrong here?

Thank you so much!

I’m guessing it’s because your model isn’t learning from the retinas themselves but rather from image metadata such as image size and pixel counts. This is a pitfall that Tom Aindow highlighted during the competition.
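One rough way to check this guess is to see how far you can get using image size alone. The sketch below is only an illustration: the `records` rows are made up, and the helper names `labels_by_image_size` and `size_only_accuracy` are my own, not anything from fastai or the competition. With real data you would read each training image's width and height and pair them with the label from the competition CSV.

```python
from collections import Counter, defaultdict


def labels_by_image_size(records):
    """Count labels per (width, height) so any size/label link is visible."""
    table = defaultdict(Counter)
    for width, height, label in records:
        table[(width, height)][label] += 1
    return table


def size_only_accuracy(records):
    """Accuracy of a baseline that predicts the most common label per image size."""
    table = labels_by_image_size(records)
    correct = sum(counts.most_common(1)[0][1] for counts in table.values())
    return correct / len(records)


# Made-up (width, height, label) rows, for illustration only.
# Real rows would come from the training images and the label file.
records = [
    (640, 480, 0), (640, 480, 0), (640, 480, 0), (640, 480, 1),
    (2416, 1736, 2), (2416, 1736, 2), (2416, 1736, 3), (2416, 1736, 2),
]

for size, counts in labels_by_image_size(records).items():
    print(size, dict(counts))
print("size-only baseline accuracy:", size_only_accuracy(records))
```

If a baseline like this gets close to your model's accuracy on the real data, and only ever outputs a couple of classes, that would be consistent with the model keying on image size rather than on the retina. It would not prove it, though, so treat it as a quick sanity check.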

You can look at other high-scoring public notebooks on Kaggle, or peruse the top solution write-ups in the discussion for this competition, to see how others dealt with this and other pitfalls. The competition organizers had a number of tricks up their sleeves to (hopefully) ensure the winning solutions generalized well, and it appears you stumbled into one of them.

I noticed in your notebook that you have a link to the first lesson. If you are just starting the fastai course, you might want to practice with a less challenging dataset and then come back to this one.
