Hi there!
I finished v1 part 1 around the holidays, and a few people from work and I have been trying different networks on a Kaggle competition: https://www.kaggle.com/c/dog-breed-identification. The goal is more to see what gives us better results than to actually submit. I decided to give ResNet50 a go and came out with really low accuracy, < 1% on my sample set. With the full dataset it is unfortunately not much better. However, if I use the same sample set with VGG16, I get about 31% accuracy on the first go, which seems like an okay starting point.
I’ve tried things like batch norm, image augmentation, re-adjusting my sample sets, etc., but haven’t been able to improve ResNet. I put together a quick code sample that reproduces the problem with the resnet50.py provided in the course files. (Originally I was using keras.applications.resnet50.)
Is there something I’m missing, or is resnet perhaps ill suited for this case?
from __future__ import print_function  # __future__ imports have to come first
from keras.preprocessing import image
import resnet50; reload(resnet50)  # resnet50.py from the course files
from resnet50 import Resnet50 as RN50

path = "dog-breeds/sample/"
model = RN50()
batches = model.get_batches(path + "train", batch_size=16)
val_batches = model.get_batches(path + "valid", batch_size=16, shuffle=False)
model.finetune(batches)  # fine-tune for the classes found in `batches`
model.fit(batches, val_batches)
A word about the data: the competition has 120 categories, each with roughly 66 to 110 images. For each category I took 10% of the train images and moved them into a validation folder. For the sample set, I took 20 images per category for training and 5 per category for validation. This does feel really small to me, but as I said, the full dataset doesn’t fare much better for ResNet.
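For anyone who wants to reproduce the split, here is a minimal sketch of the procedure described above. It assumes one sub-folder per category under the train directory; the function names `move_fraction` and `copy_sample` and the fixed seed are just for illustration and are not from the course files.

```python
import os
import random
import shutil


def _class_dirs(root):
    """Yield (class_name, class_path) for each category folder under root."""
    for cls in sorted(os.listdir(root)):
        cls_path = os.path.join(root, cls)
        if os.path.isdir(cls_path):
            yield cls, cls_path


def _ensure_dir(path):
    if not os.path.isdir(path):
        os.makedirs(path)


def move_fraction(src_dir, dst_dir, fraction=0.1, seed=0):
    """Move a random `fraction` of each category's files from src_dir to dst_dir."""
    rng = random.Random(seed)
    moved = 0
    for cls, cls_src in _class_dirs(src_dir):
        files = sorted(os.listdir(cls_src))
        # At least one file per category, but never more than are available.
        n_move = min(len(files), max(1, int(round(len(files) * fraction))))
        cls_dst = os.path.join(dst_dir, cls)
        _ensure_dir(cls_dst)
        for name in rng.sample(files, n_move):
            shutil.move(os.path.join(cls_src, name), os.path.join(cls_dst, name))
            moved += 1
    return moved


def copy_sample(src_dir, dst_dir, n_per_class, seed=0):
    """Copy up to `n_per_class` random files per category into a sample folder."""
    rng = random.Random(seed)
    copied = 0
    for cls, cls_src in _class_dirs(src_dir):
        files = sorted(os.listdir(cls_src))
        cls_dst = os.path.join(dst_dir, cls)
        _ensure_dir(cls_dst)
        for name in rng.sample(files, min(n_per_class, len(files))):
            shutil.copy(os.path.join(cls_src, name), os.path.join(cls_dst, name))
            copied += 1
    return copied
```

With hypothetical paths, the split would be `move_fraction("dog-breeds/train", "dog-breeds/valid", 0.1)`, followed by `copy_sample("dog-breeds/train", "dog-breeds/sample/train", 20)` and `copy_sample("dog-breeds/valid", "dog-breeds/sample/valid", 5)`.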
Here’s the output of the single epoch finetune:
Found 2400 images belonging to 120 classes.
Found 600 images belonging to 120 classes.
Epoch 1/1
2400/2400 [==============================] - 28s - loss: nan - acc: 0.0083 - val_loss: nan - val_acc: 0.0083
The loss being nan is also concerning. I haven’t seen nan in my larger notebook, though the loss there is much larger, around 10-15.
Now here’s the VGG:
Epoch 1/1
2400/2400 [==============================] - 36s - loss: 4.9095 - acc: 0.3150 - val_loss: 2.3816 - val_acc: 0.5317
To compare, here’s resnet against the full set:
Found 9189 images belonging to 120 classes.
Found 1033 images belonging to 120 classes.
Epoch 1/1
9189/9189 [==============================] - 95s - loss: nan - acc: 0.0078 - val_loss: nan - val_acc: 0.0077
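As a sanity check on the ResNet numbers, here is a quick comparison against chance level. With 120 categories, uniform guessing on a balanced set gives 1/120, about 0.0083, which is what the sample run reports. The sketch only uses accuracies copied from the logs above; the dictionary labels are mine, and the full set is not perfectly balanced, so the comparison there is approximate.

```python
n_classes = 120
chance_acc = 1.0 / n_classes  # accuracy of uniform guessing on a balanced set
print("chance accuracy: %.4f" % chance_acc)

# ResNet accuracies as reported in the training logs above.
reported = {
    "sample train": 0.0083,
    "sample valid": 0.0083,
    "full train": 0.0078,
    "full valid": 0.0077,
}

for name, acc in sorted(reported.items()):
    # A ratio near 1.0 means the model is doing no better than chance.
    print("%-12s acc=%.4f  ratio to chance=%.2f" % (name, acc, acc / chance_acc))
```

Every ratio comes out close to 1, so in these runs ResNet looks like it is not learning anything at all, which would fit with the nan loss.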