After building my first few models of cats vs dogs for the Kaggle competition, I got curious about how well some of the other ImageNet solutions perform as starting points for transfer learning. I was also curious about other architectures for image processing. I looked at VGG16, Resnet50 & Inception V3, and also compared @jeremy’s Vgg16 wrapper to the built-in Keras function.
My initial starting point was Jeremy’s Vgg16 with some data augmentation and a learning rate of 0.001. After some experimentation, my best model reached an accuracy of 99.1% on both training and validation after 2 epochs, which is a great starting point. My hope was that the other architectures would give similar performance and I’d be able to ensemble them, but that wasn’t the case.
Next up was Inception V3, which after more experimentation maxed out for me at around 96.5% accuracy after 2 epochs at 0.001 and 6 epochs at 0.0001. I was a little disappointed with the performance, and I’m still a little suspicious that there might be some undocumented image preprocessing that the Keras implementation isn’t doing (more on that suspicion later).
Resnet50 performed a little better, achieving 98.6% validation and training accuracy after 3 epochs at 0.001 and 6 epochs at 0.0001.
Finally, the Keras VGG16 implementation reached 97% validation and training accuracy after 2 epochs, which is much lower than the implementation by @jeremy. I’m almost certain now that what’s missing is the proper preprocessing layer, but I’m struggling to insert that layer into the existing models. I’ve tried creating a lambda layer and then adding the vgg16 model to that, but it doesn’t seem to properly connect to the model.
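To make the suspected missing step concrete, here is a rough NumPy sketch of what VGG-style preprocessing usually looks like: subtract a per-channel mean, then flip the channel order from RGB to BGR. The mean values and the function name vgg_preprocess_sketch are assumptions for illustration (the post doesn't list them), and the input is assumed to be a channels-first batch with pixel values in 0-255.

```python
import numpy as np

# Assumed per-channel ImageNet means in RGB order (not taken from the post).
VGG_MEAN_RGB = np.array([123.68, 116.779, 103.939], dtype=np.float32)


def vgg_preprocess_sketch(images):
    """Hypothetical helper: images is a float array of shape (N, 3, H, W), RGB, 0-255."""
    # Reshape the means so they broadcast across batch, height and width.
    centered = images - VGG_MEAN_RGB.reshape(1, 3, 1, 1)
    # Reverse the channel axis: RGB -> BGR.
    return centered[:, ::-1]


# Tiny example: one 2x2 "image" where every pixel is 128 in all channels.
batch = np.full((1, 3, 2, 2), 128.0, dtype=np.float32)
out = vgg_preprocess_sketch(batch)
print(out.shape)         # same shape as the input batch
print(out[0, :, 0, 0])   # per-channel values of one pixel, now in BGR order
```

Keras also ships a preprocess_input function in keras.applications.vgg16 that is meant to be applied to the images before they reach the model, so a gap like the one above could plausibly come from skipping that step rather than from the model itself.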
Any thoughts? I would like to be able to use these other models natively, rather than have to build them from scratch, and if the native implementations really don’t have the correct preprocessing that should probably be corrected in the library.
Here’s the code for the lambda layer preprocessing that I’m struggling with:
preprocess = Sequential()
preprocess.add(Lambda(vgg_preprocess, input_shape=(3,224,224)))
preprocess.add(vgg) #vggmodel from keras
x = vgg.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(2, activation='softmax')(x)
model = Model(input=preprocess.input, output=predictions)
This gives me an error saying that there are multiple outputs. If I move it after the assignment of predictions, I get an error saying that the graph is disconnected and it can’t obtain a value for the input to vgg. There’s probably an easy way to do this or something obvious I’m missing, but I’ve been digging around the Keras forums/GitHub for a while to no avail.
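For reference, here is a minimal sketch of one way the preprocessing could be wired in using the functional API throughout, instead of mixing Sequential with vgg.output. The Lambda layer and the VGG16 base are both called on tensors, so there is a single connected graph from the input to predictions. This is a sketch under assumptions, not a confirmed fix: it assumes tf.keras with channels-last inputs of shape (224, 224, 3) rather than the (3, 224, 224) ordering above, the mean values and RGB-to-BGR flip are an assumption about what vgg_preprocess does, and build_model is a hypothetical helper name.

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Input, Lambda
from tensorflow.keras.models import Model

# Assumed per-channel ImageNet means in RGB order (not given in the post).
VGG_MEAN = np.array([123.68, 116.779, 103.939], dtype=np.float32)


def vgg_preprocess(x):
    # Assumed behaviour: subtract the channel means, then flip RGB -> BGR.
    x = x - VGG_MEAN  # broadcasts over the last (channel) axis
    return x[..., ::-1]


def build_model(weights="imagenet"):
    # Hypothetical helper: preprocessing -> VGG16 conv base -> new 2-class head.
    inputs = Input(shape=(224, 224, 3))
    x = Lambda(vgg_preprocess)(inputs)
    base = VGG16(include_top=False, weights=weights, input_shape=(224, 224, 3))
    x = base(x)  # call the base model on the preprocessed tensor
    x = GlobalAveragePooling2D()(x)
    x = Dense(1024, activation="relu")(x)
    predictions = Dense(2, activation="softmax")(x)
    # Input and output belong to the same graph, so the model is connected.
    return Model(inputs=inputs, outputs=predictions)
```

Calling build_model() downloads the ImageNet weights on first use; passing weights=None builds the same graph with random weights, which is enough to check that the layers connect.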