Training accuracy stuck at 50% when training on VGG top model, without averaging

My training accuracy does not get above 50% when I try to train the dogs-vs-cats model starting from an “intermediate” or “top” layer.

I am using the VGG16 model from Keras 2.0.8 with TensorFlow 1.3.
I created a TOP model from VGG16 and saved the features it produced for the training and validation sets. The top model has just the convolutional layers, not the fully connected (Dense) layers.
I also did NOT set the pooling option (pooling=None), so there is no global averaging.

Then I create a NEW model that re-creates the last MaxPooling layer and the Dense layers, and I use the saved features as its input, but training does not get above 50%.
Surely I am mangling SOMETHING, but training never getting above 50% seems weird to me…

Here’s the code and the results:

#Imports for the code below (Keras 2.0.8 / TF 1.3)
import tensorflow as tf
from keras import backend as K
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential, Model
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from keras.optimizers import RMSprop
from keras.callbacks import ModelCheckpoint
from keras.utils import to_categorical

# dimensions of our images.
img_width, img_height = 224, 224

#Set constants.
batch_size=8
epochs=3
learning_rate=0.01
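#limit_mem() is a helper not defined in this post; a minimal sketch, assuming the
#usual fast.ai-style utility that makes TensorFlow grab GPU memory on demand:
def limit_mem():
    K.get_session().close()
    cfg = tf.ConfigProto()
    cfg.gpu_options.allow_growth = True  #allocate GPU memory as needed instead of all at once
    K.set_session(tf.Session(config=cfg))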

#This seems to help with resource issues
limit_mem()

#Don't do this after you load all the weights
init = tf.global_variables_initializer()
sess = K.get_session()
sess.run(init)
#Top level of the data paths
DATA_HOME_DIR = 'D:/CODE/data/dogscats'

#This is where the temp files and models go
CACHE_SUBDIR = 'D:/CODE/Keras/models'

#Set path to sample/ path if desired
path = DATA_HOME_DIR #+ '/sample/'
test_path = DATA_HOME_DIR + '/test/'      #We use all the test data
results_path=DATA_HOME_DIR + '/results/'
models_path=path + '/models/'
train_path=path + '/train/'
valid_path=path + '/valid/'

#Create the vanilla VGG16 model WITHOUT the fully connected top and with no pooling
topmodel = VGG16(weights='imagenet', include_top=False, pooling=None)
#Find the last convolutional layer and truncate the model there (this drops block5_pool)
last_conv_idx = [index for index, layer in enumerate(topmodel.layers)
                     if type(layer) is Conv2D][-1]
conv_model = Sequential(topmodel.layers[:last_conv_idx+1])
#Compile (not strictly needed just to run predict_generator, but harmless)
conv_model.compile(optimizer=RMSprop(lr=learning_rate),
              loss='categorical_crossentropy', metrics=['accuracy'])
#Show it!
conv_model.summary()
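#Note (added): since no input_shape was passed to VGG16, the summary shows variable
#spatial dimensions; with the 224x224 images fed below, the last conv layer
#(block5_conv3) outputs 14x14x512 feature maps -- the Input shape of the second model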

#No augmentation at this point, just the VGG preprocessing function
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
validation_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

train_generator = train_datagen.flow_from_directory(
    train_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)  #Keep order fixed so the features line up with train_labels

train_filenames = train_generator.filenames
train_samples = len(train_filenames)
train_classes = train_generator.classes
train_labels = to_categorical(train_classes)

validation_generator = validation_datagen.flow_from_directory(
    valid_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)  #Need this to be False so the classes and filenames stay in the order the features are predicted

validation_filenames = validation_generator.filenames
validation_samples = len(validation_filenames)
validation_classes = validation_generator.classes
validation_labels = to_categorical(validation_classes)
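#save_array()/load_array() are helpers not defined in this post; a minimal sketch,
#assuming the usual fast.ai-style bcolz utilities:
import bcolz
def save_array(fname, arr):
    c = bcolz.carray(arr, rootdir=fname, mode='w')  #compressed on-disk array
    c.flush()
def load_array(fname):
    return bcolz.open(fname)[:]  #read the whole array back into memory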

#Run the images through the conv layers once and cache the resulting features
train_features = conv_model.predict_generator(
            train_generator,
            steps=train_samples // batch_size + ((train_samples % batch_size)>0),  #ceil(samples/batch)
            verbose = 1)
save_array(models_path+'train_features_Input512.bc',train_features)
save_array(models_path+'train_labels_Input512.bc', train_labels)
validation_features = conv_model.predict_generator(
            validation_generator,
            steps=validation_samples // batch_size + ((validation_samples % batch_size)>0),
            verbose = 1)
save_array(models_path+'validation_features_Input512.bc',validation_features)
save_array(models_path+'validation_labels_Input512.bc', validation_labels)
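#Sanity check (added): each feature array should be (num_samples, 14, 14, 512),
#matching the Input shape of the second model below
print(train_features.shape, validation_features.shape)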

#Load them back in for now, so I can start from here...
train_features = load_array(models_path+'train_features_Input512.bc')
train_labels = load_array(models_path+'train_labels_Input512.bc')
validation_features = load_array(models_path+'validation_features_Input512.bc')
validation_labels = load_array(models_path+'validation_labels_Input512.bc')

#Set constants.
batch_size=8
learning_rate=0.0001
#Create the input layer for the new model
inputs = Input(shape=(14,14,512), name="Output_from_topmodel")

#Define the new layers/tensors for the new model
fullmodel = MaxPooling2D(name="MaxPooling_1")(inputs)
fullmodel = Flatten()(fullmodel)
fullmodel = Dense(4096, activation='relu', name='Fully_Connected_1')(fullmodel)
fullmodel = Dense(4096, activation='relu', name='Fully_Connected_2')(fullmodel)
fullmodel = Dense(2, activation='softmax', name='Binary_Activation')(fullmodel)

#Create the new model, with the feature Input as input and the binary softmax tensor as the output
fullmodel = Model(inputs, fullmodel, name='FullFromTOP_VGG16')

#Set all layers to trainable (they already are by default in a fresh model; this is just a safeguard)
for layer in fullmodel.layers[:]: layer.trainable=True

#Compile the new model
fullmodel.compile(optimizer=RMSprop(lr=learning_rate),
              loss='categorical_crossentropy', metrics=['accuracy'])

#Set a checkpoint function to save the model when it is BEST
# This not only saves weights, but also the optimizer states
checkpointer = ModelCheckpoint(filepath=results_path+'VGG16_dogsandcats_weights_TF-K2_fullmodel_BS8_Input512.h5',
               monitor = "val_acc", verbose=1, save_best_only=True, save_weights_only=False)

#Now fit the model. shuffle=True is fine here (unlike the generators above)
#because the features and labels are already paired NumPy arrays.
#fit(self, x=None, y=None, batch_size=32, epochs=1, verbose=1, callbacks=None,
#          validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None,
#          initial_epoch=0)
fullmodel.fit(
    x=train_features,
    y=train_labels,
    batch_size=batch_size,
    epochs=epochs,
    verbose=1,
    callbacks=[checkpointer],
    validation_data=(validation_features, validation_labels),
    shuffle=True)

Train on 22952 samples, validate on 2048 samples
Epoch 1/3
18376/22952 [=======================>......] - ETA: 43s - loss: 8.0941 - acc: 0.4978

I am guessing I am doing something obviously wrong, but I have been struggling with this for two days now…

I have no issues with some of the other models and get up to 0.985 accuracy on some of them, but doing it this way is stumping me…

Thanks!

Never mind… I had set my learning rate to 0.1, which was of course way too high…

Setting the learning rate to 1e-6 solved my issue and the model is training nicely now…
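For anyone hitting the same wall: that loss of ~8.09 looks like the signature of a saturated softmax predicting a single class for everything (half the samples sitting at Keras’ clipped maximum cross-entropy of about 16.1), which is what a too-high learning rate produces here. The fix amounts to recompiling the second model with a much smaller rate; a minimal sketch of the change:

#Recompile with a much smaller learning rate before fitting again
fullmodel.compile(optimizer=RMSprop(lr=1e-6),
              loss='categorical_crossentropy', metrics=['accuracy'])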
