My training accuracy does not get above 50% when I train the dogs vs. cats model starting from an “intermediate” or “top” layer.
I am using the VGG16 model from Keras 2.0.8 with Tensorflow 1.3.
I created a TOP model from VGG16 and saved the features it produced for the training and validation sets. The top model has only the convolutional layers, not the fully connected (Dense) layers.
I also did NOT set the pooling option, so there is no global average pooling.
Then I create a NEW model that re-creates the last MaxPooling layer and the Dense layers, and I use the saved features as its input, but the training accuracy does not get above 50%.
Surely I am mangling SOMETHING, but accuracy stuck at 50% seems weird to me…
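To make the pipeline described above concrete, here is a rough sketch of the tensor shapes involved, worked out with plain arithmetic rather than Keras. It assumes 224x224 inputs and the standard VGG16 layout, in which four 2x2 poolings come before the last convolutional block; the variable names are made up for illustration.

```python
# Shape walk-through for the second model, using plain arithmetic (no Keras needed).
# Assumption: 224x224 inputs and the standard VGG16 layout, where four 2x2
# poolings precede the last convolutional block.

img_size = 224
channels = 512                        # filters in VGG16's last conv block

conv_side = img_size // 2 ** 4        # 224 -> 14 after four 2x2 poolings
feature_shape = (conv_side, conv_side, channels)

pooled_side = conv_side // 2          # re-created MaxPooling2D with the default 2x2 pool
flat_size = pooled_side * pooled_side * channels

fc1_params = flat_size * 4096 + 4096  # weights + biases of the first Dense(4096) layer

print(feature_shape)  # (14, 14, 512)
print(flat_size)      # 25088
print(fc1_params)     # 102764544
```

So the saved features should have shape (N, 14, 14, 512), and the first Dense layer alone carries roughly 100 million randomly initialized weights.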
Here are the code and the results:
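import tensorflow as tf
from keras import backend as K
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.callbacks import ModelCheckpoint
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from keras.models import Sequential, Model
from keras.optimizers import RMSprop
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import to_categorical
# limit_mem, save_array and load_array are helper functions defined elsewhere (not shown here)
# dimensions of our images.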
img_width, img_height = 224, 224
#Set constants.
batch_size=8
epochs=3
learning_rate=0.01
#This seems to help with resource issues
limit_mem()
#Don't do this after you load all the weights
init = tf.global_variables_initializer()
sess = K.get_session()
sess.run(init)
#Top level of the data paths
DATA_HOME_DIR = 'D:/CODE/data/dogscats'
#This is where the temp files and models go
CACHE_SUBDIR = 'D:/CODE/Keras/models'
#Set path to sample/ path if desired
path = DATA_HOME_DIR #+ '/sample/'
test_path = DATA_HOME_DIR + '/test/' #We use all the test data
results_path=DATA_HOME_DIR + '/results/'
models_path=path + '/models/'
train_path=path + '/train/'
valid_path=path + '/valid/'
#Create the vanilla VGG16 model WITHOUT the top and no pooling
topmodel = VGG16(weights='imagenet', include_top=False, pooling = None)
last_conv_idx = [index for index, layer in enumerate(topmodel.layers)
                 if type(layer) is Conv2D][-1]
conv_model = Sequential(topmodel.layers[:last_conv_idx+1])
conv_model.compile(optimizer=RMSprop(lr=learning_rate),
                   loss='categorical_crossentropy', metrics=['accuracy'])
#Show it!
conv_model.summary()
#No augmentation at this point, only the VGG16 preprocessing function
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
validation_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
test_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_generator = train_datagen.flow_from_directory(
    train_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)
train_filenames = train_generator.filenames
train_samples = len(train_filenames)
train_classes = train_generator.classes
train_labels = to_categorical(train_classes)
#shuffle must be False so the classes and filenames stay in the same order as the predictions
validation_generator = validation_datagen.flow_from_directory(
    valid_path,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)
validation_filenames = validation_generator.filenames
validation_samples = len(validation_filenames)
validation_classes = validation_generator.classes
validation_labels = to_categorical(validation_classes)
train_features = conv_model.predict_generator(
    train_generator,
    steps=train_samples // batch_size + ((train_samples % batch_size)>0),
    verbose=1)
save_array(models_path+'train_features_Input512.bc', train_features)
save_array(models_path+'train_labels_Input512.bc', train_labels)
validation_features = conv_model.predict_generator(
    validation_generator,
    steps=validation_samples // batch_size + ((validation_samples % batch_size)>0),
    verbose=1)
save_array(models_path+'validation_features_Input512.bc', validation_features)
save_array(models_path+'validation_labels_Input512.bc', validation_labels)
#Load them back in for now, so I can start from here...
train_features = load_array(models_path+'train_features_Input512.bc')
train_labels = load_array(models_path+'train_labels_Input512.bc')
validation_features = load_array(models_path+'validation_features_Input512.bc')
validation_labels = load_array(models_path+'validation_labels_Input512.bc')
#Set constants.
batch_size=8
learning_rate=0.0001
#Create the input layer for the new model
input = Input(shape=(14,14,512), name="Output_from_topmodel")
#Define the new layer/tensor for the new model
fullmodel = MaxPooling2D(name="MaxPooling_1")(input)
fullmodel = Flatten()(fullmodel)
fullmodel = Dense(4096, activation='relu', name='Fully_Connected_1')(fullmodel)
fullmodel = Dense(4096, activation='relu', name='Fully_Connected_2')(fullmodel)
fullmodel = Dense(2, activation='softmax', name='Binary_Activation')(fullmodel)
#Create the new model, with the last layer as input and the binary activation tensor as the output
fullmodel = Model(input, fullmodel, name='FullFromTOP_VGG16')
#Set all layers to trainable
for layer in fullmodel.layers:
    layer.trainable = True
#Compile the new model
fullmodel.compile(optimizer=RMSprop(lr=learning_rate),
                  loss='categorical_crossentropy', metrics=['accuracy'])
#Set a checkpoint function to save the model when it is BEST
# This not only saves weights, but also the optimizer states
checkpointer = ModelCheckpoint(
    filepath=results_path+'VGG16_dogsandcats_weights_TF-K2_fullmodel_BS8_Input512.h5',
    monitor="val_acc", verbose=1, save_best_only=True, save_weights_only=False)
#Now fit the model
#fit(self, x=None, y=None, batch_size=32, epochs=1, verbose=1, callbacks=None,
#    validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None,
#    initial_epoch=0)
fullmodel.fit(
    x=train_features,
    y=train_labels,
    batch_size=batch_size,
    epochs=epochs,
    verbose=1,
    callbacks=[checkpointer],
    validation_data=(validation_features, validation_labels),
    shuffle=True)
Train on 22952 samples, validate on 2048 samples
Epoch 1/3
18376/22952 [=======================>......] - ETA: 43s - loss: 8.0941 - acc: 0.4978
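A quick sanity check on those two numbers may help narrow things down. The sketch below assumes the Keras TensorFlow backend clips predicted probabilities at its default epsilon of 1e-7 before taking the log; the constant and variable names are introduced here for illustration only.

```python
import math

# Assumption: categorical_crossentropy clips probabilities at the backend's
# default epsilon (1e-7), so a fully confident wrong prediction costs -ln(1e-7).
EPSILON = 1e-7
reported_acc = 0.4978   # from the progress line above
reported_loss = 8.0941  # from the progress line above

loss_per_wrong = -math.log(EPSILON)

# If every prediction is a saturated 0/1 softmax output, correct samples cost
# almost nothing and each wrong sample costs loss_per_wrong.
expected_loss = (1.0 - reported_acc) * loss_per_wrong

print(round(loss_per_wrong, 3))  # 16.118
print(round(expected_loss, 2))   # 8.09
```

Under that assumption the reported loss is almost exactly what a fully saturated softmax would give, i.e. the model appears to be predicting one class with total confidence and being wrong about half the time, rather than hovering near 50/50 probabilities (which would give a loss around 0.69).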
I am guessing I am doing something obviously wrong but I have been struggling with this for two days now…
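One piece that can be checked in isolation is the `steps` expression passed to `predict_generator`, since a mismatch between the number of extracted features and the number of labels would scramble the training data. A minimal sketch, using the sample counts from the training log; the helper name `steps_for` and the 2050 case are hypothetical.

```python
import math

def steps_for(samples, batch_size):
    # Same expression as in the predict_generator calls:
    # integer division plus one extra step if there is a partial batch.
    return samples // batch_size + ((samples % batch_size) > 0)

batch_size = 8
# 22952 and 2048 come from the training log; 2050 is a made-up count
# that exercises the partial-batch case.
for samples in (22952, 2048, 2050):
    steps = steps_for(samples, batch_size)
    assert steps == math.ceil(samples / batch_size)
    print(samples, steps)
# 22952 2869
# 2048 256
# 2050 257
```

Both real sample counts divide evenly by the batch size, so the step count covers every image exactly once and the features should line up one-to-one with the labels.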
I have no issues with some of the other models and get up to 0.985 accuracy on some of them, but doing it this way is stumping me…
Thanks!