I created a very small data set by downloading a few images of three different international currencies from Google Images. It’s definitely not the greatest data set in the world, but I was expecting better accuracy than I’m getting (it would be an easy problem for a human). I typically get about 80% accuracy, though it varies a bit.
Challenges of this data set
- There is very little training and validation data. For example, for the Hong Kong Dollars there are only 12 training images.
- The validation set contains bills from years that have no equivalent in the training set. For example, the Hong Kong validation set includes red-colored bills, but none of the bills in the training set have that style/color.
- The image sizes vary widely, ranging from 1000 (w) x 500 (h) to 450 (w) x 200 (h) pixels.
One thing that I thought would improve things a lot, but didn’t, was switching from center cropping to no cropping. With center cropping, it was transforming images like this:
to images like this:
Whereas with no cropping it squashes the images down, like this:
That seemed like it should be a lot better, since it at least preserves the word “Indonesia”. So far, though, it hasn’t improved the accuracy at all.
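For anyone curious about the difference between the two strategies, here is a minimal sketch using plain Pillow (the actual fastai transforms differ in detail; function names here are my own):

```python
from PIL import Image

def center_crop_resize(img, size=224):
    # Crop the largest centered square, then resize.
    # Edges of a wide bill (e.g. the word "Indonesia") are lost.
    w, h = img.size
    s = min(w, h)
    left, top = (w - s) // 2, (h - s) // 2
    img = img.crop((left, top, left + s, top + s))
    return img.resize((size, size), Image.BILINEAR)

def squish_resize(img, size=224):
    # Resize without cropping: the aspect ratio gets distorted,
    # but the whole bill stays visible.
    return img.resize((size, size), Image.BILINEAR)
```

With a 1000 x 500 bill, the center crop throws away a 250-pixel band on each side before resizing, while the squish keeps everything but roughly halves the horizontal scale relative to the vertical.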
I also didn’t see any improvement from data augmentation (I tried transforms_basic and transforms_side_on). I didn’t try test-time augmentation (TTA) since I don’t have a test set, and I’m a bit confused about whether I actually need one.
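For reference, the kind of side-on augmentation I was using (horizontal flips, small rotations, lighting jitter) can be sketched with plain Pillow; this is only an approximation of what fastai’s transforms_side_on does, and the exact parameters here are my own guesses:

```python
import random
from PIL import Image, ImageEnhance

def augment(img, size=224):
    # Random horizontal flip (may actually hurt for bills,
    # since the printed text becomes mirrored).
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    # Small random rotation of +/- 10 degrees.
    img = img.rotate(random.uniform(-10, 10), resample=Image.BILINEAR)
    # Random brightness jitter to simulate lighting changes.
    img = ImageEnhance.Brightness(img).enhance(random.uniform(0.8, 1.2))
    # Squish down to the final training size (no cropping).
    return img.resize((size, size), Image.BILINEAR)
```

With only ~12 training images per class, augmentation like this mostly multiplies near-duplicate views of the same bills, which may be why it didn’t move the accuracy much.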
The data set
Latest Jupyter Notebook snapshot
(if you click “show original” it should take you to github with a rendered snapshot of the jupyter notebook)