Training accuracy stuck at 50% when training on VGG top model, without averaging

My training accuracy does not get above 50% when I try to train the dogs-vs-cats model starting from an “intermediate” or “top” layer.

I am using the VGG16 model from Keras 2.0.8 with TensorFlow 1.3.
I created a TOP model from VGG16 and saved the features it produced for the training and validation sets. The top model has just the convolutional layers, not the fully connected (Dense) layers.
I also did NOT set the pooling option (pooling=None), so there is no global averaging.

Then I create a NEW model that re-creates the last MaxPooling layer and the Dense layers, and I use the saved features as its input, but training does not get above 50%.
Surely I am mangling SOMETHING, but training never getting above 50% seems weird to me…

Here’s the code and the results:

#Imports for the code below (Keras 2.0.8 / TF 1.3)
import tensorflow as tf
from keras import backend as K
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential, Model
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from keras.optimizers import RMSprop
from keras.callbacks import ModelCheckpoint
from keras.utils import to_categorical

# dimensions of our images.
img_width, img_height = 224, 224

#Set constants.
batch_size=8
epochs=3
learning_rate=0.01
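#limit_mem() is a helper not defined in this post; a minimal sketch, assuming the
#usual fast.ai-style utility that makes TensorFlow grab GPU memory on demand:
def limit_mem():
    K.get_session().close()
    cfg = tf.ConfigProto()
    cfg.gpu_options.allow_growth = True  #allocate GPU memory as needed instead of all at once
    K.set_session(tf.Session(config=cfg))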

#This seems to help with resource issues
limit_mem()

#Don't do this after you load all the weights
init = tf.global_variables_initializer()
sess = K.get_session()
sess.run(init)
#Top level of the data paths
DATA_HOME_DIR = 'D:/CODE/data/dogscats'

#This is where the temp files and models go
CACHE_SUBDIR = 'D:/CODE/Keras/models'

#Set path to sample/ path if desired
path = DATA_HOME_DIR #+ '/sample/'
test_path = DATA_HOME_DIR + '/test/'      #We use all the test data
results_path=DATA_HOME_DIR + '/results/'
models_path=path + '/models/'
train_path=path + '/train/'
valid_path=path + '/valid/'

#Create the vanilla VGG16 model WITHOUT the fully connected top and with no pooling
topmodel = VGG16(weights='imagenet', include_top=False, pooling=None)
#Find the last convolutional layer and truncate the model there (this drops block5_pool)
last_conv_idx = [index for index, layer in enumerate(topmodel.layers)
                     if type(layer) is Conv2D][-1]
conv_model = Sequential(topmodel.layers[:last_conv_idx+1])
#Compile (not strictly needed just to run predict_generator, but harmless)
conv_model.compile(optimizer=RMSprop(lr=learning_rate),
              loss='categorical_crossentropy', metrics=['accuracy'])
#Show it!
conv_model.summary()
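#Note (added): since no input_shape was passed to VGG16, the summary shows variable
#spatial dimensions; with the 224x224 images fed below, the last conv layer
#(block5_conv3) outputs 14x14x512 feature maps -- the Input shape of the second model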

#No augmentation at this point, just the VGG preprocessing function
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
validation_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

train_generator = train_datagen.flow_from_directory(
    train_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)  #Keep order fixed so the features line up with train_labels

train_filenames = train_generator.filenames
train_samples = len(train_filenames)
train_classes = train_generator.classes
train_labels = to_categorical(train_classes)

validation_generator = validation_datagen.flow_from_directory(
    valid_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)  #Need this to be False so the classes and filenames stay in the order the features are predicted

validation_filenames = validation_generator.filenames
validation_samples = len(validation_filenames)
validation_classes = validation_generator.classes
validation_labels = to_categorical(validation_classes)
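#save_array()/load_array() are helpers not defined in this post; a minimal sketch,
#assuming the usual fast.ai-style bcolz utilities:
import bcolz
def save_array(fname, arr):
    c = bcolz.carray(arr, rootdir=fname, mode='w')  #compressed on-disk array
    c.flush()
def load_array(fname):
    return bcolz.open(fname)[:]  #read the whole array back into memory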

#Run the images through the conv layers once and cache the resulting features
train_features = conv_model.predict_generator(
            train_generator,
            steps=train_samples // batch_size + ((train_samples % batch_size)>0),  #ceil(samples/batch)
            verbose = 1)
save_array(models_path+'train_features_Input512.bc',train_features)
save_array(models_path+'train_labels_Input512.bc', train_labels)
validation_features = conv_model.predict_generator(
            validation_generator,
            steps=validation_samples // batch_size + ((validation_samples % batch_size)>0),
            verbose = 1)
save_array(models_path+'validation_features_Input512.bc',validation_features)
save_array(models_path+'validation_labels_Input512.bc', validation_labels)
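#Sanity check (added): each feature array should be (num_samples, 14, 14, 512),
#matching the Input shape of the second model below
print(train_features.shape, validation_features.shape)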

#Load them back in for now, so I can start from here...
train_features = load_array(models_path+'train_features_Input512.bc')
train_labels = load_array(models_path+'train_labels_Input512.bc')
validation_features = load_array(models_path+'validation_features_Input512.bc')
validation_labels = load_array(models_path+'validation_labels_Input512.bc')

#Set constants.
batch_size=8
learning_rate=0.0001
#Create the input layer for the new model
inputs = Input(shape=(14,14,512), name="Output_from_topmodel")

#Define the new layers/tensors for the new model
fullmodel = MaxPooling2D(name="MaxPooling_1")(inputs)
fullmodel = Flatten()(fullmodel)
fullmodel = Dense(4096, activation='relu', name='Fully_Connected_1')(fullmodel)
fullmodel = Dense(4096, activation='relu', name='Fully_Connected_2')(fullmodel)
fullmodel = Dense(2, activation='softmax', name='Binary_Activation')(fullmodel)

#Create the new model, with the feature Input as input and the binary softmax tensor as the output
fullmodel = Model(inputs, fullmodel, name='FullFromTOP_VGG16')

#Set all layers to trainable (they already are by default in a fresh model; this is just a safeguard)
for layer in fullmodel.layers[:]: layer.trainable=True

#Compile the new model
fullmodel.compile(optimizer=RMSprop(lr=learning_rate),
              loss='categorical_crossentropy', metrics=['accuracy'])

#Set a checkpoint function to save the model when it is BEST
# This not only saves weights, but also the optimizer states
checkpointer = ModelCheckpoint(filepath=results_path+'VGG16_dogsandcats_weights_TF-K2_fullmodel_BS8_Input512.h5',
               monitor = "val_acc", verbose=1, save_best_only=True, save_weights_only=False)

#Now fit the model. shuffle=True is fine here (unlike the generators above)
#because the features and labels are already paired NumPy arrays.
#fit(self, x=None, y=None, batch_size=32, epochs=1, verbose=1, callbacks=None,
#          validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None,
#          initial_epoch=0)
fullmodel.fit(
    x=train_features,
    y=train_labels,
    batch_size=batch_size,
    epochs=epochs,
    verbose=1,
    callbacks=[checkpointer],
    validation_data=(validation_features, validation_labels),
    shuffle=True)

Train on 22952 samples, validate on 2048 samples
Epoch 1/3
18376/22952 [=======================>......] - ETA: 43s - loss: 8.0941 - acc: 0.4978

I am guessing I am doing something obviously wrong, but I have been struggling with this for two days now…

I have no issues with some of the other models and get up to 0.985 accuracy on some of them, but doing it this way is stumping me…

Thanks!

Never mind… I had set my learning rate to 0.1, which was of course way too high…

Setting the learning rate to 1e-6 solved my issue and the model is training nicely now…
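For anyone hitting the same wall: that loss of ~8.09 looks like the signature of a saturated softmax predicting a single class for everything (half the samples sitting at Keras’ clipped maximum cross-entropy of about 16.1), which is what a too-high learning rate produces here. The fix amounts to recompiling the second model with a much smaller rate; a minimal sketch of the change:

#Recompile with a much smaller learning rate before fitting again
fullmodel.compile(optimizer=RMSprop(lr=1e-6),
              loss='categorical_crossentropy', metrics=['accuracy'])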
