Lesson 2: Can't improve on 92% validation accuracy on Cats vs Dogs


In the Lesson 2 video, it appears that Jeremy is achieving approximately 98% validation accuracy. I’m not able to achieve greater than 92% accuracy. Further, no matter how much I train (with variable learning rates), my training loss/accuracy fails to improve.

Might you have any advice on how I can improve my result? I’ll provide context below.

Environment: I’m running on my own GTX 1060, which can handle a maximum batch size of 32.

Variables that I have explored:

  • Batch size (ranging from 8 to 32)
  • Learning rate (from {0.1, 0.05, 0.01, 0.005, 0.001})
  • Increased training (up to 15 epochs)

No matter the variable explored, I top out at 92%.

Because we create the validation set at random, I thought it might be possible that I had randomly generated a challenging validation set. However, I attempted to re-seed and generate a new validation set only to find the same plateau in training/validation accuracy.

For context, I provide my code. I manually add the “fine-tuning”:

from keras.layers import Dense
from vgg16 import Vgg16
vgg = Vgg16()
for layer in vgg.model.layers:
    layer.trainable = False
vgg.model.add(Dense(2, activation='softmax'))

Create the batches with:

batch_size = 24
batches = get_batches(train_path, batch_size=batch_size, shuffle=True)
val_batches = get_batches(valid_path, batch_size=batch_size, shuffle=True)

And, I run training intervals with the following function that both calls vgg.model.compile() (to set the learning rate) and vgg.model.fit_generator() (to fit the training data):

from keras.optimizers import RMSprop

def run_epochs(last_epoch, no_of_epochs, learning_rate):
    # Recompile so that the new learning rate takes effect.
    vgg.model.compile(optimizer=RMSprop(lr=learning_rate), loss='categorical_crossentropy', metrics=['accuracy'])
    print "Running %d additional epochs with lr=%f." % (no_of_epochs, learning_rate)
    latest_weights_filename = (results_path + 'ft%d.h5' % last_epoch) if last_epoch > 0 else None
    print "Loading weights: %s" % latest_weights_filename
    if latest_weights_filename:
        # Resume from the checkpoint written by the previous interval.
        vgg.model.load_weights(latest_weights_filename)
    for epoch in range(last_epoch+1, last_epoch + no_of_epochs + 1):
        print "Running epoch: %d" % epoch
        vgg.model.fit_generator(
            batches, samples_per_epoch=batches.n, nb_epoch=1,
            validation_data=val_batches, nb_val_samples=val_batches.n)
        # Save a checkpoint after every epoch.
        latest_weights_filename = results_path + 'ft%d.h5' % epoch
        vgg.model.save_weights(latest_weights_filename)
        print "Saved weights: %s" % latest_weights_filename
    print "Completed %s fit operations" % no_of_epochs

Any feedback would be greatly appreciated.

(Pavel Surmenok) #2

Have you resolved the issue?
I get similar results. I run training with default learning rate, batch size 64. I get about 92% validation accuracy.

Epoch 1: val_loss: 0.2328 - val_acc: 0.9160
Epoch 2: val_loss: 0.2374 - val_acc: 0.9125
Epoch 3: val_loss: 0.2013 - val_acc: 0.9265
Epoch 4: val_loss: 0.2018 - val_acc: 0.9270
Epoch 5: val_loss: 0.1902 - val_acc: 0.9280
Epoch 6: val_loss: 0.2072 - val_acc: 0.9290

Comparing it with the output saved in dogs_cats_redux.ipynb notebook on GitHub:

val_loss: 0.2205 - val_acc: 0.9825

My validation accuracy is much worse than theirs. But validation loss is better.

Since I use the same data (though it could be split into train/validation differently because of randomization) and the same code, getting different results is weird.

(Pavel Surmenok) #3

I think I found the cause of this issue. I was using TensorFlow as a backend, because it is the default for Keras now, and I didn’t expect a backend to impact results.
After I set these options in ~/.keras/keras.json:

"image_dim_ordering": "th",
"backend": "theano"

I get much better loss and accuracy:

val_loss: 0.0348 - val_acc: 0.9885

It’s interesting what’s so different between TensorFlow and Theano that it impacts results so much.
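A quick way to confirm what the config file actually says is to parse it directly. This is only a sketch: read_keras_config is a hypothetical helper, not part of Keras, and it assumes the file lives at ~/.keras/keras.json. Keras 1.x uses the image_dim_ordering key shown above; Keras 2.x renamed it to image_data_format (with values channels_first / channels_last), so the helper checks both.

```python
import json
import os


def read_keras_config(path=None):
    """Return (backend, ordering) from a Keras config file.

    Hypothetical helper for illustration; it only parses JSON
    and never imports Keras itself.
    """
    if path is None:
        # Assumed default location of the Keras config file.
        path = os.path.join(os.path.expanduser("~"), ".keras", "keras.json")
    with open(path) as config_file:
        config = json.load(config_file)
    backend = config.get("backend")
    # Try the Keras 1.x key first, then fall back to the Keras 2.x key.
    ordering = config.get("image_dim_ordering", config.get("image_data_format"))
    return backend, ordering
```

Calling read_keras_config() with no argument reads the default location. Note that this only reports what is in the file; code that sets the ordering at runtime would not show up here.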


Thanks for sharing your results, @surmenok. I had not resolved this issue myself and was frustrated not to figure out the root of the problem. I was indeed using TensorFlow as the backend, not Theano. I’ll go back and check to see if I get a comparable improvement.

It would be great to get feedback from a veteran on here: Why would the switch in backend (from TensorFlow to Theano) lead to such a significant boost in results? It’s a bit disconcerting to see such a difference without understanding why. Could it be the case that for a different problem the TensorFlow backend will out-produce the Theano backend?

Any takers?

(Vaisakh) #5

In TensorFlow, the dimensions of an image tensor are (batches, height, width, channels), whereas in Theano the dimensions are (batches, channels, height, width). These conventions are usually referred to as channels_last and channels_first respectively.

So, if you’ve ordered the image data with channels_first (which you have, assuming you’re following the lessons), the backend should be set to Theano and image_dim_ordering to 'th'.

By default, Keras uses the TensorFlow backend and image_dim_ordering 'tf'.
As such, it expects that the last axis of the tensor has the color channels.

TL;DR: TensorFlow expects something, you gave it something else :wink:
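To make the two layouts concrete, here is a minimal pure-Python sketch that reorders a single image from (height, width, channels) to (channels, height, width). hwc_to_chw is a name made up for this illustration; with a NumPy array the same reordering would be image.transpose(2, 0, 1).

```python
def hwc_to_chw(image):
    """Reorder a nested-list image from [row][col][channel] to [channel][row][col]."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    return [
        [[image[row][col][ch] for col in range(width)] for row in range(height)]
        for ch in range(channels)
    ]


# A 1x2 RGB image: one red pixel followed by one blue pixel.
channels_last = [[[255, 0, 0], [0, 0, 255]]]
channels_first = hwc_to_chw(channels_last)
print(channels_first)  # [[[255, 0]], [[0, 0]], [[0, 255]]]
```

The values are identical in both layouts; only the axis they are grouped by changes, which is why feeding one layout to code that expects the other scrambles the image.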

(Pavel Surmenok) #6

I don’t think dimension ordering is the issue here. I tried both ‘th’ and ‘tf’, and in both cases got accuracy around 92%. I think it’s more likely that the weights need to be converted to reflect the difference in how convolutional layers are implemented in Theano and TensorFlow, according to this article. But I haven’t tried that.
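For background on what such a difference can look like: as far as I know, Theano's conv2d flips the kernel by default (a true convolution), while TensorFlow's conv2d does not (a cross-correlation). Weights learned under one convention then give different outputs under the other unless each kernel is flipped first. Below is a 1-D pure-Python sketch of the idea; the helper names and toy numbers are made up for illustration, and in 2-D the flip applies to both spatial axes.

```python
def cross_correlate(signal, kernel):
    """Slide the kernel over the signal without flipping it ('valid' mode)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def convolve(signal, kernel):
    """True convolution: cross-correlation with the kernel reversed."""
    return cross_correlate(signal, kernel[::-1])


signal = [1, 2, 3, 4]
kernel = [1, 0, -1]  # toy weights, standing in for a learned filter

print(cross_correlate(signal, kernel))  # [-2, -2]
print(convolve(signal, kernel))         # [2, 2]
# Flipping the kernel before convolving recovers the cross-correlation result.
print(convolve(signal, kernel[::-1]))   # [-2, -2]
```

The same weights produce opposite signs under the two conventions here, which is the kind of mismatch that would degrade a pretrained network without breaking it outright.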

(Vaisakh) #7

Thanks, Pavel :slight_smile:

I didn’t know that the implementation was different in both backends.
I’ll check on the article you linked to.

Still, I’m puzzled why the accuracy stays the same.
It should be different, since the dimension ordering is different.


Thank you very much for the explanation, @svaisakh. I don’t think I would have figured that out on my own.


Just to confirm, when I switched to the Theano backend with image_dim_ordering set to "th", I did indeed see an immediate validation accuracy improvement (> 98.4% within 2 epochs of training as compared to ~92% validation accuracy with endless training).

(Daniel Elenius) #10

I have the same issue. 92% on TensorFlow, 98% on Theano. It’s not because of wrong dimension ordering.
The code ignores what ordering you set in keras.json, because vgg16.py has this line that sets it manually:

K.set_image_dim_ordering('th')


If you change that to ‘tf’ when running on tensorflow, you get ~50% accuracy, i.e. random guesses, as you would expect when running with completely messed up tensors! :slight_smile:

I tried the suggestion from @surmenok above, but after that, I got the “50% problem” even with “th” ordering, so something must have gone wrong there…

There is also more discussion and scripts here, but those require me to first create both a theano and tensorflow model from scratch, rather than just processing the model file, and I’m not sure how I’m supposed to do that.

Did anyone get to 98% with TF, with the plain dogs_cats_redux worksheet?

(Brian Park) #11

It turns out that Keras 2.x has a handy function that converts models that were trained using Theano to TensorFlow.

from keras.utils import convert_all_kernels_in_model

Calling that function on the model after loading the weights was all that was necessary. I plan on updating the code and publishing it on GitHub. I’ll post the link here when it is ready.

(Brian Park) #12

I wrote a blog post describing how to get the notebooks from the first class to run with Keras2 with TensorFlow backend running on AMD GPU using ROCm.

Hope this helps someone here.

(Souraj) #13

Hi, was anyone able to resolve this? I am using the TensorFlow backend, but I am not able to get the 98% result.
If we have to run the code with the TensorFlow dimension ordering, I suspect some changes have to be made.
For one, I think the preprocessing step needs to be changed to:
vgg_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32).reshape((1,1,3))

I made the change with no luck. If anyone has run this on the TensorFlow backend with the image ordering changed (for example, keeping it as (224,224,3)), it would be great if you could suggest a pointer.