Poor validation accuracy when splitting model into conv_model and fc_model

I am able to get the basic vgg_bn running. However, when I try to split the original model into conv_model and fc_model, the validation accuracy is terrible (this is the case with Dogs vs Cats as well as the Fisheries competition).
Here's the code that I implemented:


    conv_layers, fc_layers = split_at(model, Convolution2D)

    conv_model = Sequential(conv_layers)

    conv_feat = conv_model.predict_generator(batches, batches.nb_sample)
    conv_val_feat = conv_model.predict_generator(val_batches, val_batches.nb_sample)

    def get_bn_layers():
        return [
            Dense(512, activation='relu'),
            Dense(512, activation='relu'),
            Dense(8, activation='softmax')
        ]

    bn_model = Sequential(get_bn_layers())
    bn_model.compile(Adam(lr=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])

    bn_model.fit(conv_feat, trn_labels, batch_size=batch_size, nb_epoch=15,
                 validation_data=(conv_val_feat, val_labels))

Also, here’s how I created my data for initial use:

    batches = get_batches(path+'train', shuffle=True, batch_size=batch_size)
    val_batches = get_batches(path+'valid', shuffle=False, batch_size=batch_size)

    val_classes = val_batches.classes
    trn_classes = batches.classes
    val_labels = onehot(val_classes)
    trn_labels = onehot(trn_classes)

Please note: I tried shuffle=True for both train/validation batches, and shuffle=False for both as well.
Regardless, the accuracy is bad on the validation set.


If you are doing that, make sure you call reset() on the batches. There should be an attribute on batches, something along the lines of batch_index, that tracks the generator's position. If you have used the generator before and batches.nb_sample doesn't divide cleanly by your batch size, your labels and features will be misaligned. (I think it is a good idea to call reset() even when those values do align; based on what I have seen, Keras might be doing something funny there anyhow, but I'm not sure.)
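To make the failure mode concrete, here is a toy sketch in plain Python (not Keras itself; `ToyBatches`, its `batch_index`, `reset()`, and `next()` are simplified stand-ins for the real iterator): a generator that keeps its position between uses starts mid-cycle, so predictions come out rotated relative to the labels.

```python
# Hypothetical toy iterator mimicking a Keras-1.x-style batch generator: it
# keeps an internal batch_index across calls, so "predicting" without reset()
# starts mid-cycle instead of at sample 0.
class ToyBatches:
    def __init__(self, samples, batch_size):
        self.samples = samples          # stands in for images/labels in directory order
        self.batch_size = batch_size
        self.batch_index = 0            # persists between uses

    def reset(self):
        self.batch_index = 0

    def next(self):
        start = (self.batch_index * self.batch_size) % len(self.samples)
        batch = [self.samples[(start + i) % len(self.samples)]
                 for i in range(self.batch_size)]
        self.batch_index += 1
        return batch

labels = list(range(10))                # 10 samples, batch_size 4: 10 % 4 != 0
batches = ToyBatches(labels, batch_size=4)

batches.next()                          # simulate having used the generator before

# Without reset(), the next pass starts at sample 4, so the "feature" order
# no longer matches the label order.
misaligned = batches.next() + batches.next() + batches.next()[:2]
print(misaligned[:4])                   # [4, 5, 6, 7]

batches.reset()
aligned = batches.next() + batches.next() + batches.next()[:2]
print(aligned == labels)                # True
```

The same logic explains why a non-divisible nb_sample makes things worse: every full pass leaves the index parked partway through a batch.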

@radek, thanks for your reply. I did not understand what you meant, or specifically how to do it.

Independently, I recreated the training and validation batches using get_batches just before using the split models. That did not help either. To give a sense of how bad the accuracy is (for the fisheries dataset):

  1. Using the vgg_bn model and 5 epochs, the numbers are:

Epoch 1/5
3327/3327 [==============================] - 104s - loss: 2.8963 - acc: 0.4740 - val_loss: 1.5522 - val_acc: 0.6267
Epoch 2/5
3327/3327 [==============================] - 98s - loss: 1.6289 - acc: 0.6345 - val_loss: 0.9716 - val_acc: 0.7444
Epoch 3/5
3327/3327 [==============================] - 94s - loss: 1.1984 - acc: 0.7142 - val_loss: 0.7352 - val_acc: 0.8000
Epoch 4/5
3327/3327 [==============================] - 95s - loss: 1.0620 - acc: 0.7406 - val_loss: 0.4956 - val_acc: 0.8711
Epoch 5/5
3327/3327 [==============================] - 93s - loss: 0.9405 - acc: 0.7668 - val_loss: 0.4342 - val_acc: 0.8889

Whereas, after creating batches and splitting the model at the end of the Convolution2D layers, and after running 50 epochs(!), the numbers are as below:

Epoch 43/50
3327/3327 [==============================] - 2s - loss: 0.4245 - acc: 0.8533 - val_loss: 3.8565 - val_acc: 0.2778
Epoch 44/50
3327/3327 [==============================] - 2s - loss: 0.4146 - acc: 0.8647 - val_loss: 3.7787 - val_acc: 0.2733
Epoch 45/50
3327/3327 [==============================] - 2s - loss: 0.4216 - acc: 0.8590 - val_loss: 3.8436 - val_acc: 0.3044
Epoch 46/50
3327/3327 [==============================] - 2s - loss: 0.4226 - acc: 0.8563 - val_loss: 3.7721 - val_acc: 0.2689
Epoch 47/50
3327/3327 [==============================] - 2s - loss: 0.3947 - acc: 0.8708 - val_loss: 3.9400 - val_acc: 0.2800
Epoch 48/50
3327/3327 [==============================] - 2s - loss: 0.3897 - acc: 0.8687 - val_loss: 3.8830 - val_acc: 0.2800
Epoch 49/50
3327/3327 [==============================] - 2s - loss: 0.3637 - acc: 0.8765 - val_loss: 3.9882 - val_acc: 0.2867
Epoch 50/50
3327/3327 [==============================] - 2s - loss: 0.3546 - acc: 0.8771 - val_loss: 3.8869 - val_acc: 0.2956

You can see that the validation accuracy hovers around 28-30%.

Really tough to say what is going on. Assuming the validation set has images similar to those in the training set, there really is no good explanation for this disparity between results on the training set and the validation set.

A low score on the validation set but a good score on the training set could indicate that you are overfitting the training set: your model isn't learning anything useful that generalizes to images it hasn't seen. But I do not know how that could be possible given your setup. More likely, there is something wrong with your validation set.

You could try recreating the validation set by predicting the conv features straight after you call get_batches. Other than that, you might also try building a simpler model on top of the conv layers and see how that fares. Another thing you might want to consider is initially training for a couple of epochs with an even lower learning rate, 1e-6 or something like that.

Sorry, not really sure what could be happening here.

Can you try changing your dense model to layers like this:

    def get_bn_layers(p):
        return [
            Dense(512, activation='relu'),
            Dense(512, activation='relu'),
            Dense(8, activation='softmax')
        ]
I'm not clear on what your 2nd FC layer is doing (BatchNormalization(axis=1)), but the rest of the approach sounds like what most of us are doing.

hope that helps

@avina09 Since you are pre-calculating the output of the convolutional layers, you should set shuffle to False when getting your training batches.
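Here is a toy illustration of why (plain Python, not Keras; `classes` stands in for `batches.classes`, which is stored in directory order regardless of shuffling): with a shuffled generator, the precomputed feature rows come out in a different order than the labels built from `batches.classes`.

```python
# What batches.classes would hold: labels in directory order.
classes = [0, 0, 0, 1, 1, 1, 2, 2]

def yielded_classes(order):
    # The true class of each sample, in the order the generator actually
    # yields it; precomputed conv features come out in this order, row for row.
    return [classes[i] for i in order]

in_order = list(range(len(classes)))    # shuffle=False: directory order
one_epoch = list(reversed(in_order))    # stand-in for one shuffled epoch's order

print(yielded_classes(in_order) == classes)   # True: features align with trn_labels
print(yielded_classes(one_epoch) == classes)  # False: row i's feature gets the wrong label
```

So the model is effectively trained on (feature, label) pairs that don't belong together, which matches the "good training accuracy, terrible validation accuracy" symptom in this thread only by accident of the shuffled training pairs still being learnable noise.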


@torkku Thanks. I had the exact same problem and setting shuffle to False solved it.

I don't understand why, though. I thought the shuffling would happen at the time of calling get_batches (which in turn calls ImageDataGenerator). I thought get_batches would randomly select images to construct the batches, but once constructed, the batches would be 'fixed'. Hence, when you call model.predict_generator, there is no more shuffling happening.

Can anyone perhaps help?

@rauten No probs. Been there, done that myself. This must be a common trap.

You are correct: I don't think predict_generator shuffles the batches either.

However, when you train your model using one of the fit() methods with shuffle=True batches, the batches get reshuffled every epoch.
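A minimal sketch of that last point (plain Python, not Keras; the per-epoch seeding in `epoch_order` is a hypothetical stand-in for however the real generator reshuffles): with shuffling on, each epoch draws a fresh permutation, so the order seen when you precomputed the features never matches the order seen during any later pass.

```python
import random

# Toy model of a shuffling batch generator: shuffle=False always yields the
# same (directory) order; shuffle=True draws a new permutation each epoch.
def epoch_order(n, epoch, shuffle):
    order = list(range(n))
    if shuffle:
        # hypothetical seeding, just to make each epoch's permutation differ
        random.Random(epoch).shuffle(order)
    return order

print(epoch_order(8, epoch=1, shuffle=False) == epoch_order(8, epoch=2, shuffle=False))  # True
# With shuffle=True, two epochs almost certainly disagree:
print(epoch_order(8, epoch=1, shuffle=True) == epoch_order(8, epoch=2, shuffle=True))
```

That is why shuffle=False is the safe choice whenever you precompute features once and reuse them against labels collected separately.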