VGG16 fine tuning: low accuracy


(damien ng) #21

Change it to Python 2. My guess is Keras has a problem with Python 3.


(damien ng) #22

I tried with Python 2. It works for a moment, then it comes back to the same behaviour as Python 3. It drives me crazy. I don’t know if the model is cached somewhere, but between different runs of fit_generator my loss jumps from 0.3 to 8.0.


#23

Hi,

I seem to have a similar issue with the cats&dogs VGG16 fine-tuning…

I use Python 3 with Keras 2 and TensorFlow. I use the Keras VGG16 model, which is built with the functional API instead of the Sequential one.

First I instantiated the full VGG16 and replaced the final layer with a Dense(2). Training worked fine. But when I tried to re-train all the dense layers, the accuracy collapsed to 50%.
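Roughly, that first variant looked like this (just a minimal sketch of the idea, not my exact code):

from keras.applications.vgg16 import VGG16
from keras.layers import Dense
from keras.models import Model

vgg = VGG16(include_top=True)           # full VGG16 with its original dense head
x = vgg.layers[-2].output               # output of the last 4096-unit fc layer
x = Dense(2, activation='softmax')(x)   # new 2-class final layer
model = Model(inputs=vgg.input, outputs=x)

for layer in model.layers[:-1]:         # only the new final layer stays trainable
    layer.trainable = False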

I thought there was something I did not understand that prevents re-training the pre-trained layers. So I instantiated only the convolutional part and added the full dense part myself. Again, re-training only the last dense layer works, but re-training e.g. the last two dense layers makes the accuracy collapse.

I already tried small/big learning rates, and the Adam optimizer. I’m completely out of ideas.

Here is my final code. Change a single character and it works:
[-2:] -> [-1:] when setting the final layers trainable

%matplotlib inline
import os
from importlib import reload
import utils3; reload(utils3)
from utils3 import *

import tensorflow as tf
import keras.backend.tensorflow_backend as ktf

def set_session_options(**kwargs):
    gpu_options = tf.GPUOptions(**kwargs)
    ktf.set_session(tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)))

set_session_options(allow_growth=True)

path = "data/dogscats/"
model_path = path + 'models/'
if not os.path.exists(model_path): os.mkdir(model_path)

batch_size = 64

val_batches = get_batches(path+'valid', shuffle=False, batch_size=batch_size)
batches = get_batches(path+'train', shuffle=True, batch_size=batch_size)

val_classes = val_batches.classes
trn_classes = batches.classes
val_labels = onehot(val_classes)
trn_labels = onehot(trn_classes)

def fit_model(model, batches, val_batches, nb_epoch=1):
    model.fit_generator(batches, steps_per_epoch=ceil(batches.n / batches.batch_size),
                        epochs=nb_epoch,
                        validation_data=val_batches,
                        validation_steps=ceil(val_batches.n / val_batches.batch_size))

opt = RMSprop(lr=0.1)
#opt = Adam(lr=0.00001)

# convolutional part of VGG16 only; the dense head is added below
from keras.applications.vgg16 import VGG16
model = VGG16(include_top=False, input_shape=(224, 224, 3))

flatten = Flatten()
x = flatten(model.output)
x = Dense(4096, activation='relu')(x)
x = Dense(4096, activation='relu')(x)
x = Dense(2, activation='softmax')(x)
model = Model(inputs=model.input, outputs=x)

# freeze everything, then make only the last two layers trainable
for layer in model.layers:
    layer.trainable = False
for layer in model.layers[-2:]:
    layer.trainable = True

model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

model.load_weights(model_path+'finetune1.h5')

fit_model(model, batches, val_batches, 2)

#24

OK, I found the solution, at least in my case.

First, you need to compile after setting trainable. I think I read the opposite in one of Jeremy’s notebooks, but maybe that was different with Keras 1.
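Roughly, the order that worked for me, sketched with the model and optimizer names from the code above:

for layer in model.layers:
    layer.trainable = False
for layer in model.layers[-2:]:
    layer.trainable = True

# compile only AFTER changing trainable, otherwise the old settings stay in effect
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])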

Second, this well-trained model is still quite sensitive. I had to include the dropouts AND reduce the learning rate: for Adam, 1e-4 for training 2 dense layers and 1e-5 for 3 dense layers.
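For reference, a minimal sketch of how the head looks with the dropouts back in. The 0.5 rate and the exact placement are my assumption, and conv_model stands for the include_top=False VGG16 from the code above:

from keras.layers import Dense, Dropout, Flatten
from keras.models import Model
from keras.optimizers import Adam

x = Flatten()(conv_model.output)        # conv_model: VGG16(include_top=False, ...) as above
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)                     # assumed rate
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(2, activation='softmax')(x)
model = Model(inputs=conv_model.input, outputs=x)

for layer in model.layers:
    layer.trainable = False
for layer in model.layers[-4:]:         # last two Dense layers (Dropout layers have no weights)
    layer.trainable = True

model.compile(optimizer=Adam(lr=1e-4),  # 1e-4 for two trainable Dense layers, 1e-5 for three
              loss='categorical_crossentropy', metrics=['accuracy'])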

That’s it. Hope it helps you too.


(houda kaddioui) #25

Hi, follow-up question: is the position of the dropout important? Would the dropout go between the two dense layers, or after them and before the softmax?
Thanks!