Hi, I started the fast.ai classes 2 weeks ago and I am finding them very entertaining.
I tried to enter the dog breed identification challenge (https://www.kaggle.com/c/dog-breed-identification) using the VGG model from lesson1. The data set contains a total of 10222 pictures divided into 120 categories. I used 10% of the data to make the validation set. To be sure that there is at least one picture of each breed in the validation set, I explicitly took 10% of each category.
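For reference, here is a minimal sketch of the kind of per-category split I mean. It is not my exact code: the function name `stratified_split`, the `{filename: breed}` dictionary and the toy breeds are only assumptions for illustration.

```python
import random
from collections import defaultdict

def stratified_split(labels, valid_frac=0.1, seed=0):
    """Split a {filename: breed} dict into (train, valid) filename lists, breed by breed."""
    by_breed = defaultdict(list)
    for filename, breed in labels.items():
        by_breed[breed].append(filename)

    rng = random.Random(seed)
    train, valid = [], []
    for breed in sorted(by_breed):
        files = sorted(by_breed[breed])
        rng.shuffle(files)
        # Keep at least one validation picture per breed, even for small categories.
        n_valid = max(1, round(len(files) * valid_frac))
        valid.extend(files[:n_valid])
        train.extend(files[n_valid:])
    return train, valid

# Toy example with 3 hypothetical breeds holding 20, 10 and 5 pictures.
toy = {f"{breed}_{i}.jpg": breed
       for breed, n in [("beagle", 20), ("pug", 10), ("whippet", 5)]
       for i in range(n)}
train_files, valid_files = stratified_split(toy)
```

Splitting breed by breed, instead of sampling 10% of the whole set at once, is what guarantees that no breed is missing from the validation set.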
The problem is that my model doesn't seem able to fit this data well.
After the first epoch results are the following: loss: 12.2193 - acc: 0.2138 - val_loss: 11.9785 - val_acc: 0.2401
After 20 epochs the results are not much better: loss: 11.0408 - acc: 0.3108 - val_loss: 11.5460 - val_acc: 0.2807
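To put these numbers in context, here is a quick sanity check of what a model that guesses every breed with equal probability would score. It assumes the reported loss is categorical cross-entropy with the natural logarithm, which I have not verified for my setup.

```python
import math

n_classes = 120

# Categorical cross-entropy of a model that predicts 1/120 for every breed.
uniform_loss = -math.log(1.0 / n_classes)

# Accuracy expected from picking one of the 120 breeds uniformly at random.
chance_acc = 1.0 / n_classes

print(f"uniform-guess loss: {uniform_loss:.4f}, chance accuracy: {chance_acc:.4f}")
```

Under that assumption the uniform-guess loss is about 4.79, so the accuracy above is well over chance while the loss is far worse than a uniform guess, which may be a clue in itself.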
I spot-checked some randomly chosen folders of the train and validation sets, and the pictures seem to be correctly classified.
Those pictures are from ImageNet, so I am curious about what could make my model fail, given that it is based on VGG16.
The average number of pictures per category is less than 100. Is it possible that there is not enough data to train the model?
Has anyone tried to enter this competition using the method from lesson1 and gotten similarly bad results?
Thanks in advance.