Fine-tuning the top layers of a CNN using Keras

I am trying to fine-tune the top layers of FaceNet as described in "Building powerful image classification models using very little data".
I have the facenet_keras.h5 and facenet_keras_weights.h5 files from the keras-facenet GitHub repo.

I can load them into my script, but what convolutional base should I define? I do not know how to define a base for FaceNet.
The tutorial uses VGG16:

train_data_path = 'dataset_cfps/train'
validation_data_path = 'dataset_cfps/validation'
test_data_path = 'test'
img_width, img_height = 224, 224

# path to the model weights files.
weights_path = 'keras-facenet/weights'
top_model_weights_path = 'keras-facenet/model'

nb_train_samples = 1774
nb_validation_samples = 313
epochs = 50
batch_size = 16
# build the VGG16 network
model = applications.VGG16(weights='imagenet', include_top=False)
print('Model loaded.')

The tutorial also trains its model with the first 25 layers frozen:

# build a classifier model to put on top of the convolutional model
top_model = Sequential()
top_model.add(Dense(256, activation='relu'))
top_model.add(Dense(1, activation='sigmoid'))

# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning

# add the model on top of the convolutional base

# set the first 25 layers (up to the last conv block)
# to non-trainable (weights will not be updated)
for layer in model.layers[:25]:
    layer.trainable = False

How can I determine the last conv block for FaceNet? I cannot find its architecture, or perhaps I did not understand it well enough to tell up to which layer the model should stay frozen. Can someone help me with:

1. Finding the right convolutional base.
2. Deciding how many layers not to train.

I have 12 classes of facial images, with 1774 training images and 313 validation images.
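
One way to approach the second point, sketched under assumptions: print the layer names (for example with model.summary()), pick the last layer that should stay frozen, and freeze everything up to it by name instead of by a hard-coded index. The helper freeze_up_to and the stand-in objects below are hypothetical. conv_block_1 and conv_block_2 are placeholder names, while AvgPool, Dropout, Bottleneck and Bottleneck_BatchNorm are names that appear in the facenet_keras.h5 summary later in this thread. The helper relies only on .layers, .name and .trainable, which Keras models and layers provide.

```python
from types import SimpleNamespace


def freeze_up_to(model, last_frozen_name):
    """Freeze every layer up to and including the named one.

    Returns the index of the boundary layer. Raises ValueError if the
    name is not found among model.layers.
    """
    names = [layer.name for layer in model.layers]
    boundary = names.index(last_frozen_name)
    for i, layer in enumerate(model.layers):
        # Layers after the boundary stay trainable; the rest are frozen.
        layer.trainable = i > boundary
    return boundary


# Stand-in for a loaded Keras model (hypothetical, for illustration only).
layer_names = ['conv_block_1', 'conv_block_2', 'AvgPool',
               'Dropout', 'Bottleneck', 'Bottleneck_BatchNorm']
demo = SimpleNamespace(
    layers=[SimpleNamespace(name=n, trainable=True) for n in layer_names])

idx = freeze_up_to(demo, 'AvgPool')
for layer in demo.layers:
    print(layer.name, 'trainable' if layer.trainable else 'frozen')
```

With a real model, the same call would take the object returned by load_model(...), and the optimizer would only update the layers left trainable once the model is compiled after freezing.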

Please check the code folder: the file “” defines the whole model. To get the model, load the facenet_keras file, which contains the model, and then load the weights. After that, freeze the conv layers and train the classifier layers with your data. If you want to use Keras, try the old MOOC videos.

Hey, thank you for replying.
I tried loading the facenet_keras.h5 file directly, and the model summary ends with the layers shown below. I get an error on the line where I take the output of the base model and add a global spatial average pooling layer. I am not sure, but the problem seems to be the input to the GlobalAveragePooling2D() layer.

How do I reshape the output so that it fits the GlobalAveragePooling2D() layer? The last layers look like:

Dropout (Dropout)               (None, 1792)         0           AvgPool[0][0]                    
Bottleneck (Dense)              (None, 128)          229376      Dropout[0][0]                    
Bottleneck_BatchNorm (BatchNorm (None, 128)          384         Bottleneck[0][0]             

Here is my code:

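from keras.models import load_model, Model
from keras.layers import GlobalAveragePooling2D, Dense

img_width, img_height = 200, 200

# paths to the weights file and to the full model file
weights_path = 'keras-facenet/weights/facenet_keras_weights.h5'
top_model_weights_path = 'keras-facenet/model/facenet_keras.h5'

# load the full pretrained model (architecture and weights)
base_model = load_model(top_model_weights_path)

x = base_model.output
x = GlobalAveragePooling2D()(x)  # this line raises the ValueError below
x = Dense(256, activation='relu')(x)
predictions = Dense(12, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=predictions)

# freeze every layer of the pretrained base
for layer in base_model.layers:
    layer.trainable = False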

The error I get is:

    ValueError                                Traceback (most recent call last)
    <ipython-input-24-4a1c01c5761e> in <module>()
          5 x = base_model.output
    ----> 6 x = GlobalAveragePooling2D()(x)
          7 x = Dense(256, activation='relu')(x)
          8 predictions = Dense(12, activation='softmax')(x)

    /usr/local/lib/python3.6/dist-packages/keras/engine/ in __call__(self, inputs, **kwargs)
        573                 # Raise exceptions in case the input is not compatible
        574                 # with the input_spec specified in the layer constructor.
    --> 575                 self.assert_input_compatibility(inputs)
        577                 # Collect input shapes to build layer.

    /usr/local/lib/python3.6/dist-packages/keras/engine/ in assert_input_compatibility(self, inputs)
        472                             + ': expected ndim=' +
        473                                      str(spec.ndim) + ', found ndim=' +
    --> 474                                      str(K.ndim(x)))
        475             if spec.max_ndim is not None:
        476                 ndim = K.ndim(x)

    ValueError: Input 0 is incompatible with layer global_average_pooling2d_2: expected ndim=4, found ndim=2
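
The error arises because GlobalAveragePooling2D expects a 4D feature map (batch, height, width, channels), while the loaded model already ends in a 2D embedding of shape (None, 128), as the summary above shows; the AvgPool layer has already done the pooling. A minimal sketch of attaching the classifier directly, assuming tensorflow.keras is installed. The stand-in base model is hypothetical and only mimics the output shape of the real facenet_keras.h5 model:

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense

# Stand-in for load_model('.../facenet_keras.h5'). It is NOT the real
# FaceNet; it only reproduces the final (None, 128) output shape.
inputs = Input(shape=(1792,))
embedding = Dense(128, use_bias=False, name='Bottleneck')(inputs)
base_model = Model(inputs, embedding, name='stand_in_base')

# Freeze the base so that only the new classifier head is trained.
for layer in base_model.layers:
    layer.trainable = False

# The output is already 2D, so no pooling or reshaping is needed.
x = base_model.output
x = Dense(256, activation='relu')(x)
predictions = Dense(12, activation='softmax')(x)  # 12 face classes

model = Model(inputs=base_model.input, outputs=predictions)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```

With the real model, base_model would come from load_model(...) and the GlobalAveragePooling2D line would simply be dropped.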

You should first load Inception_ResNet_v1; check the code for reference and for the pretrained data…

I passed the weights path from my code above. I have checked that the files are correctly placed; they were downloaded from the keras-facenet repo on GitHub:

    base_model = InceptionResNetV1(input_shape=(224, 224, 3),
                                   weights_path=weights_path)

    top_model = Sequential()
    top_model.add(Flatten(input_shape=(img_width, img_height, 3)))
    top_model.add(Dense(256, activation='relu'))
    top_model.add(Dense(12, activation='sigmoid'))

    for layer in base_model.layers[:422]:
        layer.trainable = False

I get this error because the weights do not match. How can I get past this?

InvalidArgumentError                      Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ in _create_c_op(graph, node_def, inputs, control_inputs)
   1566   try:
-> 1567     c_op = c_api.TF_FinishOperation(op_desc)
   1568   except errors.InvalidArgumentError as e:

InvalidArgumentError: Dimension 1 in both shapes must be equal, but are 12 and 128. Shapes are [1792,12] and [1792,128]. for 'Assign_1460' (op: 'Assign') with input shapes: [1792,12], [1792,128].

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-13-1e88430b9e07> in <module>()
      3                       classes=12,
      4                       dropout_keep_prob=0.8,
----> 5                       weights_path= weights_path)
      7 # base_model = load_model(top_model_weights_path)

<ipython-input-10-8d164826b2b0> in InceptionResNetV1(input_shape, classes, dropout_keep_prob, weights_path)
    208     model = Model(inputs, x, name='inception_resnet_v1')
    209     if weights_path is not None:
--> 210         model.load_weights(weights_path)
    212     return model

/usr/local/lib/python3.6/dist-packages/keras/engine/ in load_weights(self, filepath, by_name, skip_mismatch, reshape)
   2665             else:
   2666                 load_weights_from_hdf5_group(
-> 2667                     f, self.layers, reshape=reshape)
   2669     def _updated_config(self):

/usr/local/lib/python3.6/dist-packages/keras/engine/ in load_weights_from_hdf5_group(f, layers, reshape)
   3391                              ' elements.')
   3392         weight_value_tuples += zip(symbolic_weights, weight_values)
-> 3393     K.batch_set_value(weight_value_tuples)

/usr/local/lib/python3.6/dist-packages/keras/backend/ in batch_set_value(tuples)
   2370                 assign_placeholder = tf.placeholder(tf_dtype,
   2371                                                     shape=value.shape)
-> 2372                 assign_op = x.assign(assign_placeholder)
   2373                 x._assign_placeholder = assign_placeholder
   2374                 x._assign_op = assign_op

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/ in assign(self, value, use_locking)
    613       the assignment has completed.
    614     """
--> 615     return state_ops.assign(self._variable, value, use_locking=use_locking)
    617   def assign_add(self, delta, use_locking=False):

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/ in assign(ref, value, validate_shape, use_locking, name)
    281     return gen_state_ops.assign(
    282         ref, value, use_locking=use_locking, name=name,
--> 283         validate_shape=validate_shape)
    284   return ref.assign(value, name=name)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/ in assign(ref, value, validate_shape, use_locking, name)
     58     _, _, _op = _op_def_lib._apply_op_helper(
     59         "Assign", ref=ref, value=value, validate_shape=validate_shape,
---> 60         use_locking=use_locking, name=name)
     61     _result = _op.outputs[:]
     62     _inputs_flat = _op.inputs

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ in _apply_op_helper(self, op_type_name, name, **keywords)
    785         op = g.create_op(op_type_name, inputs, output_types, name=scope,
    786                          input_types=input_types, attrs=attr_protos,
--> 787                          op_def=op_def)
    788       return output_structure, op_def.is_stateful, op

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ in create_op(self, op_type, inputs, dtypes, input_types, name, attrs, op_def, compute_shapes, compute_device)
   3390           input_types=input_types,
   3391           original_op=self._default_original_op,
-> 3392           op_def=op_def)
   3394       # Note: shapes are lazily computed with the C API enabled.

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ in __init__(self, node_def, g, inputs, output_types, control_inputs, input_types, original_op, op_def)
   1732           op_def, inputs, node_def.attr)
   1733       self._c_op = _create_c_op(self._graph, node_def, grouped_inputs,
-> 1734                                 control_input_ops)
   1735     else:
   1736       self._c_op = None

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ in _create_c_op(graph, node_def, inputs, control_inputs)
   1568   except errors.InvalidArgumentError as e:
   1569     # Convert to ValueError for backwards compatibility.
-> 1570     raise ValueError(str(e))
   1572   return c_op

ValueError: Dimension 1 in both shapes must be equal, but are 12 and 128. Shapes are [1792,12] and [1792,128]. for 'Assign_1460' (op: 'Assign') with input shapes: [1792,12], [1792,128].

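In InceptionResNetV1(input_shape=(224, 224, 3), classes=12, …) I think you should have classes=128, since the error shows that the saved weights have shape [1792,128] while your model expects [1792,12]…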
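
The shapes in the error message support this: a Dense kernel has shape (input_dim, units), so classes=12 builds a (1792, 12) kernel while the weights file stores (1792, 128). A small sketch of that check in plain Python; dense_kernel_shape is a hypothetical helper, not a Keras function:

```python
def dense_kernel_shape(input_dim, units):
    """Kernel shape of a Dense layer: (input_dim, units)."""
    return (input_dim, units)


# Shape stored in the weights file, taken from the error message.
saved_kernel = (1792, 128)

for classes in (12, 128):
    built = dense_kernel_shape(1792, classes)
    status = 'loads' if built == saved_kernel else 'shape mismatch'
    print('classes=%d -> kernel %s: %s' % (classes, built, status))
```

Under this reading, the model would be built with classes=128 so that the weights load, and a separate Dense(12, activation='softmax') layer would then be added on top for the 12 classes.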