I am trying to build a VGG16 model using Keras' built-in applications, and I was doing the following:
from keras.applications.vgg16 import VGG16
from keras.layers import Dense
from keras.models import Model

vgg16 = VGG16(weights='imagenet', include_top=True, input_shape=(224, 224, 3))
# Drop the original head: predictions, fc2, fc1
vgg16.layers.pop()
vgg16.layers.pop()
vgg16.layers.pop()
# Freeze the remaining (convolutional) layers
for layer in vgg16.layers:
    layer.trainable = False
# Rebuild the head on top of the flatten layer (ReLU, matching VGG16's own fc layers)
m = Dense(4096, activation='relu')(vgg16.layers[-1].output)
m = Dense(4096, activation='relu')(m)
m = Dense(25, activation='softmax')(m)
vgg16 = Model(vgg16.input, m)
vgg16.compile(optimizer='adam', loss='categorical_crossentropy')
Then I fit the model, got horrible results, and they kept leveling out. My (wrong) assumption was that something was broken in my workflow. After beating my head against this issue for three days, I finally figured out the problem:
I don't have enough data to train that many weights. To fix it, all I had to do was stop trying to retrain the two 4096-unit dense layers and retrain only the final softmax layer. When I did this, I got a much better result and it didn't bottom out. I don't have any questions on this one; I just wanted to share this gotcha so that anyone who is three days behind me can save themselves the headache.
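For scale: roughly 120 million of VGG16's ~138 million parameters live in the two fc layers alone, so leaving those trainable means fitting almost the whole network from a small dataset. A quick sanity check against the stock model makes this obvious:

from keras.applications.vgg16 import VGG16

vgg16 = VGG16(weights='imagenet', include_top=True, input_shape=(224, 224, 3))
# Per-layer parameter counts; fc1 (~102.8M) and fc2 (~16.8M) dwarf everything else
for layer in vgg16.layers:
    print(layer.name, layer.count_params())
print('total:', vgg16.count_params())  # ~138.4M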
Here is my final model:
vgg16 = VGG16(weights='imagenet', include_top=True, input_shape=(224, 224, 3))
# Drop only the original 1000-way predictions layer
vgg16.layers.pop()
# Freeze everything that's left, including the two pretrained fc layers
for layer in vgg16.layers:
    layer.trainable = False
# Train just the new 25-way softmax head
m = Dense(25, activation='softmax')(vgg16.layers[-1].output)
vgg16 = Model(vgg16.input, m)
vgg16.compile(optimizer='adam', loss='categorical_crossentropy')
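For completeness, here is roughly how I fit it. x_train and y_train are placeholders for my own images and one-hot labels, and the batch size and epoch count are arbitrary; the preprocessing call is the one the ImageNet weights expect:

from keras.applications.vgg16 import preprocess_input

# x_train: (n, 224, 224, 3) float images, y_train: (n, 25) one-hot labels (placeholder names)
x_train = preprocess_input(x_train)  # RGB->BGR plus ImageNet mean subtraction
vgg16.fit(x_train, y_train, batch_size=32, epochs=10, validation_split=0.1)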
My plan now is to get a result for ResNet50 and maybe a few of the other models Keras has prebuilt, and then give each of them a vote before deciding on my answer.
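In case it's useful to anyone, here is a minimal sketch of the voting step I have in mind, assuming models is a list of the fine-tuned models and x_test has already been preprocessed appropriately for each one (both names are placeholders):

import numpy as np

# Hard majority vote: each model contributes its argmax class per sample
preds = np.stack([m.predict(x_test).argmax(axis=1) for m in models])  # (n_models, n_samples)
final = np.apply_along_axis(
    lambda v: np.bincount(v, minlength=25).argmax(), 0, preds)

Averaging the raw softmax outputs across models and taking a single argmax would be a softer alternative to the hard vote.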