Low accuracy with ResNet50 for the Kaggle dog breeds competition

Hi there!

I finished v1 part 1 around the holidays, and a few people from work and I have been trying different networks on a Kaggle competition: https://www.kaggle.com/c/dog-breed-identification. It's more to see what gives us better results than for an actual submission. I decided to give ResNet50 a go and got really low accuracy, under 1% on my sample set. With the full dataset it's unfortunately not much better. However, if I use the same sample set with VGG16, I get about 31% accuracy on the first go, which seems like an okay starting point.

I've tried things like batch norm and image augmentation, re-adjusted my sample sets, etc., but haven't been able to improve ResNet. I put together a quick code sample that reproduces the problem with the resnet50.py provided in the course files (I was using keras.applications.resnet50 before).

Is there something I'm missing, or is ResNet perhaps ill-suited for this case?

from __future__ import print_function  # __future__ imports must come before any other statement
from keras.preprocessing import image
import resnet50; reload(resnet50)  # resnet50.py from the course files (Python 2 reload)
from resnet50 import Resnet50 as RN50

path = "dog-breeds/sample/"

model = RN50()  # pretrained ResNet50 wrapper from the course files
batches = model.get_batches(path + "train", batch_size=16)
val_batches = model.get_batches(path + "valid", batch_size=16, shuffle=False)
model.finetune(batches)  # swap the top layer for a 120-way classifier
model.fit(batches, val_batches)
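For comparison, here's roughly what the same experiment looks like against keras.applications directly. One thing worth double-checking in either setup is preprocessing: keras.applications.resnet50 expects its own preprocess_input (caffe-style BGR reordering and mean subtraction), and feeding it raw RGB pixels can leave it at chance-level accuracy. This is just a minimal sketch assuming Keras 2 conventions (the Keras 1 signatures differ slightly), not exactly what I ran:

from keras.applications.resnet50 import ResNet50, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.optimizers import Adam

path = "dog-breeds/sample/"

# pretrained convolutional base, ImageNet classifier head removed
base = ResNet50(include_top=False, weights="imagenet", input_shape=(224, 224, 3))
x = GlobalAveragePooling2D()(base.output)
out = Dense(120, activation="softmax")(x)  # new 120-way head for the breeds
model = Model(inputs=base.input, outputs=out)

# freeze the pretrained layers and train only the new head first
for layer in base.layers:
    layer.trainable = False
model.compile(optimizer=Adam(lr=1e-3), loss="categorical_crossentropy", metrics=["accuracy"])

# preprocess_input applies the BGR reordering + mean subtraction ResNet50 was trained with
gen = ImageDataGenerator(preprocessing_function=preprocess_input)
batches = gen.flow_from_directory(path + "train", target_size=(224, 224), batch_size=16)
val_batches = gen.flow_from_directory(path + "valid", target_size=(224, 224), batch_size=16, shuffle=False)

model.fit_generator(batches, steps_per_epoch=2400 // 16, epochs=1,
                    validation_data=val_batches, validation_steps=600 // 16)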

To talk about the data for a second: the competition has 120 categories, each with anywhere from roughly 66 to 110 images. For each category I took 10% of the training images and moved them into a validation folder (a sketch of the split script is below). For the sample set, I took 20 images per category for training and 5 per category for validation. That does feel really small to me, but as I said, the full dataset doesn't fare much better with ResNet.
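For what it's worth, the split itself was done with a small script along these lines; the paths come from my layout and the exact script is reconstructed from memory, so treat it as a sketch:

import os
import random
import shutil

random.seed(42)  # fixed seed so the split is reproducible

src_root = "dog-breeds/train"
val_root = "dog-breeds/valid"

for breed in os.listdir(src_root):
    src_dir = os.path.join(src_root, breed)
    val_dir = os.path.join(val_root, breed)
    if not os.path.isdir(src_dir):
        continue
    if not os.path.exists(val_dir):
        os.makedirs(val_dir)
    files = os.listdir(src_dir)
    random.shuffle(files)
    n_val = max(1, len(files) // 10)  # move ~10% of each breed to valid/
    for fname in files[:n_val]:
        shutil.move(os.path.join(src_dir, fname), os.path.join(val_dir, fname))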

Here's the output of the single-epoch finetune:

Found 2400 images belonging to 120 classes.
Found 600 images belonging to 120 classes.
Epoch 1/1
2400/2400 [==============================] - 28s - loss: nan - acc: 0.0083 - val_loss: nan - val_acc: 0.0083

The loss being nan is also concerning. I haven't seen that in my larger notebook, though the loss numbers there are much larger, around 10-15.
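One thing I may try next for the nan loss is a much lower learning rate before fitting. Assuming the course wrapper exposes the underlying Keras model as .model (the way the course's vgg16.py does), recompiling would look something like this; the 1e-5 value is just a guess to test whether the loss is blowing up:

from keras.optimizers import Adam

# recompile the wrapped Keras model with a much smaller learning rate
model.model.compile(optimizer=Adam(lr=1e-5),
                    loss="categorical_crossentropy",
                    metrics=["accuracy"])
model.fit(batches, val_batches)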

Now here’s the VGG:

Epoch 1/1
2400/2400 [==============================] - 36s - loss: 4.9095 - acc: 0.3150 - val_loss: 2.3816 - val_acc: 0.5317

To compare, here's ResNet against the full dataset:

Found 9189 images belonging to 120 classes.
Found 1033 images belonging to 120 classes.
Epoch 1/1
9189/9189 [==============================] - 95s - loss: nan - acc: 0.0078 - val_loss: nan - val_acc: 0.0077

FYI, the new PyTorch version of the course tackles this dataset with ResNet - and gets great results! :slight_smile:
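If you want a preview, the lesson 1 recipe there looks roughly like this (from memory, using the fastai library's API at the time, so treat the details as approximate); it assumes the same train/valid folder layout:

from fastai.conv_learner import *

PATH = "dog-breeds/"
arch = resnet50
sz = 224

# tfms_from_model picks transforms and normalization stats that match the pretrained weights
data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(1e-2, 3)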

Interesting! I'm looking to start v2 pretty soon, so I have that to look forward to. I'm running CUDA 8 on this machine with Keras and Theano, so I'll have many things to update. Though I find it interesting that the PyTorch version of essentially the same network performs better. :thinking:

I'm getting similar results, and haven't tried running this with PyTorch yet. Why are the results that much better?

Why is the initial accuracy with fastai even better than with plain PyTorch or Keras? Since we start training from the same pretrained model, the initial accuracy should be similar in both cases, but fastai gives better accuracy than even plain PyTorch.
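One concrete difference worth ruling out before comparing numbers: the frameworks don't feed the network the same values. Keras's ResNet50 weights expect caffe-style preprocessing, while torchvision/fastai pretrained models expect [0, 1]-scaled images normalized with the ImageNet mean and std. If the preprocessing doesn't match the weights, the pretrained features degrade and "initial accuracy" can look wildly different across stacks. A sketch of the two conventions, with the usual published stats:

import numpy as np

BGR_MEANS = np.array([103.939, 116.779, 123.68])  # caffe-style per-channel means
TORCH_MEAN = np.array([0.485, 0.456, 0.406])      # torchvision ImageNet mean
TORCH_STD = np.array([0.229, 0.224, 0.225])       # torchvision ImageNet std

def keras_resnet50_style(img):
    # img: HxWx3 RGB array; Keras ResNet50 wants BGR with means subtracted, no scaling
    x = img[..., ::-1].astype("float32")
    return x - BGR_MEANS

def torchvision_style(img):
    # img: HxWx3 RGB array; torchvision/fastai want [0, 1] scaling, then mean/std normalization
    x = img.astype("float32") / 255.0
    return (x - TORCH_MEAN) / TORCH_STD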