Thank you for that post, it was very useful. Here is what happened:
- Following Rachel's post, I split my training and test sets more rigorously (the key here was to have all the images of the same city either in the train set or in the test set, never in both). For anyone running into the same issue: split your sets yourself, then use `ImageDataBunch.from_csv` or `ImageDataBunch.from_df`.
- Unfortunately, the result was a much lower test accuracy, around 60%, and no fiddling with the training seemed to improve it. So I went and got more data.
- I downloaded data for the 25,000 largest cities in the world, which gives around 100,000 labeled images in the dataset. Interestingly, resnet34 (which was better on the small dataset) turned out to be much worse than resnet50 on the large dataset: resnet34 got to about 75% accuracy, while resnet50 easily got to 80%. Fine-tuning also helped a lot with this larger dataset, improving accuracy by almost 5 points.
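For anyone who wants to reproduce the city-level split from the first point, here is a minimal sketch in plain Python. It is not the exact code I ran: the function name `split_by_city`, the `(image_path, city)` tuple layout, and the `test_frac` and `seed` parameters are all assumptions made for illustration. The idea is simply to shuffle the cities rather than the images, so that every image of a given city ends up on one side only.

```python
import random
from collections import defaultdict


def split_by_city(samples, test_frac=0.2, seed=0):
    """Split (image_path, city) pairs so each city appears in only one set.

    `samples`, `test_frac` and `seed` are hypothetical names for this sketch.
    """
    # Group image paths by the city they belong to.
    by_city = defaultdict(list)
    for path, city in samples:
        by_city[city].append(path)

    # Shuffle the cities (not the images) with a fixed seed for reproducibility.
    cities = sorted(by_city)
    random.Random(seed).shuffle(cities)

    # Hold out a fraction of the cities, at least one, for the test set.
    n_test = max(1, round(len(cities) * test_frac))
    test_cities = set(cities[:n_test])

    train = [(p, c) for c in cities if c not in test_cities for p in by_city[c]]
    test = [(p, c) for c in cities if c in test_cities for p in by_city[c]]
    return train, test
```

The two lists can then be written out to a CSV or a data frame (with whatever label column you use) before loading them.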
Conclusion: with this new model, accuracy is now back to 85% (using rigorously split train and test sets).
The result is extremely pleasing, and the webapp is now almost creepy to try out; it seems to get almost everything right: yourcityfrom.space