Batch Normalization - Fine tune VGG model - Low accuracy when tuning the conv layer

(Hamid Khakpour) #1

I created the VGG model using Keras with the TensorFlow backend and Theano image ordering. In the fully connected block, I added batch normalization after the ReLU activation layers. I loaded the vgg16_bn.h5 weights into the model. Then I popped the last 8 layers, back to the max-pooling layer in the conv5 block, “model.add(MaxPooling2D((2,2), strides=(2,2)))”.

model.add(Dense(4096, activation='relu'))
model.add(Dense(4096, activation='relu'))
model.add(Dense(1000, activation='softmax'))

I precomputed the features up to that point and saved them. Then I created a top_model for my problem, loaded the precomputed features into it, and trained it. After 20 epochs I got 88% accuracy on the validation set and 99% accuracy on the training set. Then I saved the top model's weights to a file.
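The precompute step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in this post: it assumes tensorflow.keras with the default channels-last ordering (the post uses Theano ordering), a tiny stand-in convolutional base and random data in place of VGG16 and the real images, and 5 classes. The names conv_base and bottleneck_train.npy are hypothetical.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for the VGG convolutional base (hypothetical and much smaller).
conv_base = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(8, 3, activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2), strides=(2, 2)),
])

# Random data standing in for the real images and labels (5 classes assumed).
rng = np.random.default_rng(0)
x_train = rng.random((16, 32, 32, 3), dtype=np.float32)
y_train = keras.utils.to_categorical(rng.integers(0, 5, size=16), num_classes=5)

# Run the base once and cache its output on disk.
features = conv_base.predict(x_train, verbose=0)
feature_path = os.path.join(tempfile.mkdtemp(), "bottleneck_train.npy")
np.save(feature_path, features)

# Train only the small classifier on the cached features.
cached = np.load(feature_path)
top_model = keras.Sequential([
    keras.Input(shape=cached.shape[1:]),
    layers.Flatten(),
    layers.Dense(16, activation="relu"),
    layers.BatchNormalization(),  # batch norm after the ReLU, as in the post
    layers.Dense(5, activation="softmax"),
])
top_model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
top_model.fit(cached, y_train, epochs=1, batch_size=8, verbose=0)
```

Because the base runs only once per image, the top model trains quickly. The cached features are only valid as long as the base weights and the input preprocessing stay the same.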

Now I want to fine-tune the model (train the last conv block as well). So I have the VGG model with batch normalization only on the fully connected layers; I loaded the vgg_bn weights and popped the last 8 layers. Then I created my top_model, loaded the weights saved in the previous step, and added it on top of the vgg_bn model without its last 8 layers. Then, for the first 25 layers, I set trainable to False.

for layer in model.layers[:25]:
    layer.trainable = False

Without any training, if I evaluate on my validation_generator, I get 88% accuracy. After I train the model for 1 epoch, my accuracy is 20% on training and 36% on validation. Training for 40 epochs does not help either. I would appreciate any help.
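For reference, the freezing step in this post can be sanity-checked with a sketch like the one below. It is a minimal illustration, not a confirmed fix for the accuracy drop: it assumes tensorflow.keras, uses a tiny stand-in network instead of VGG16 plus the custom top, and the layer names and n_frozen value are hypothetical. The small SGD learning rate is likewise an assumption, a common choice when fine-tuning pretrained layers.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Tiny stand-in for the combined network (hypothetical; the real one
# is the truncated VGG model plus the custom top_model).
model = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(16, activation="relu", name="frozen_1"),
    layers.Dense(16, activation="relu", name="frozen_2"),
    layers.Dense(16, activation="relu", name="tuned_block"),
    layers.Dense(3, activation="softmax", name="top"),
])

n_frozen = 2  # the post uses 25 for the full model

for layer in model.layers[:n_frozen]:
    layer.trainable = False

# Compile after changing the flags so the training step reflects them.
# The low learning rate is an assumption, meant to keep updates gentle.
model.compile(
    optimizer=keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

# Print which layers will actually be updated during fine-tuning.
for i, layer in enumerate(model.layers):
    print(i, layer.name, layer.trainable)

trainable_names = [layer.name for layer in model.layers if layer.trainable]
```

Printing the index, name, and trainable flag of every layer makes it easy to confirm that the slice boundary falls where intended, for example at the start of the last conv block.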

(魏璎珞) #2

This might be helpful

(Hamid Khakpour) #3

I do not have batch normalization in the conv block layers. I only have it after the activation layers in the fully connected block. Thanks for the reply.

(魏璎珞) #4

By the way, I'm interested to know why you chose VGG for your problem.