I have worked through the first two lessons and am now going through them again, this time with my own problem and dataset in mind. I want to implement a model that classifies playing cards (Ace of Hearts, Jack of Spades, Four of Diamonds, etc.).
I created a set of images (~300) from a couple different decks of cards (all digital, not photos of cards).
My first attempt was to simplify the problem into two separate models: one to classify the suit (spades, diamonds, hearts, or clubs) and another to classify the rank (two, three, four, etc.).
The suit classifier was able to reach 100% accuracy (though I'm worried it might be overfitting).
However, the model trying to classify the rank does not do well at all. I'm pretty surprised by this, because it feels like a simpler version of MNIST: there's no bad handwriting, every character is perfectly drawn, so it should be quite easy.
Some of the cards have the rank drawn twice on the face, and some have a dark background (most have a white background).
Any ideas for what else to try to improve the rank classifier? I'm thinking an extra preprocessing step that filters out the background color, so all the images end up black/white or grayscale, might help.
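In case it's useful, here's a minimal sketch of the kind of preprocessing I have in mind, using plain NumPy (the luminance weights are standard, but the threshold of 128 and the "invert if mostly dark" heuristic are guesses you'd want to tune against real cards):

```python
import numpy as np

def to_binary(rgb, threshold=128):
    """Collapse an RGB image (H, W, 3 uint8 array) to black/white.

    Converts to grayscale with standard luminance weights, then
    thresholds. If the result is mostly black (a dark-background
    card), invert it so every card ends up dark-rank-on-white.
    """
    gray = rgb @ np.array([0.299, 0.587, 0.114])    # per-pixel luminance
    binary = (gray >= threshold).astype(np.uint8) * 255
    if (binary == 255).mean() < 0.5:                 # dark background
        binary = 255 - binary
    return binary

# Toy example: a mostly-dark "card" with one light mark.
card = np.zeros((4, 4, 3), dtype=np.uint8)
card[1, 1] = [250, 250, 250]
out = to_binary(card)   # inverted: white background, dark mark at (1, 1)
```

The idea is just to normalize away the background-color variation so the rank model only sees shape, not color.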
The other thing I did was remove all transformations when building the DataBunch, because these aren't photos; I don't want any cropping, rotation, etc. I guess maybe it also doesn't make sense to use ImageNet pretrained weights for this use case?
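For reference, this is roughly how I'm building the DataBunch without augmentation (fastai v1 sketch; the path, validation split, and size are placeholders for my actual setup):

```python
from fastai.vision import ImageDataBunch, imagenet_stats

# ds_tfms=None skips the default flip/rotate/zoom augmentations,
# which seems right for clean digital renders rather than photos.
data = ImageDataBunch.from_folder(
    'cards/',            # placeholder path to the image folders
    valid_pct=0.2,       # placeholder validation split
    ds_tfms=None,        # no augmentation
    size=224,            # placeholder image size
).normalize(imagenet_stats)
```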