Error when checking model target: expected predictions to have shape (None, 1000) but got array with shape (64, 2)


Notebook so far: Notebook

I am trying to solve the homework for Lesson 1 by using the standard VGG16 built into Keras.

I had to recreate the pop() and add() functions to remove the final Dense(1000) layer and replace it with a Dense(2) layer.

However, when I try to use the fit_generator function, I get the following error:

ValueError: Error when checking model target: expected predictions to have shape (None, 1000) but got array with shape (64, 2)

It sounds like my model is still expecting to output 1000 categories rather than 2. Why is this?

Model summary below:

Layer (type)                  Output Shape           Param #     Connected to
input_27 (InputLayer)         (None, 224, 224, 3)    0
block1_conv1 (Convolution2D)  (None, 224, 224, 64)   1792        input_27[0][0]
block1_conv2 (Convolution2D)  (None, 224, 224, 64)   36928       block1_conv1[0][0]
block1_pool (MaxPooling2D)    (None, 112, 112, 64)   0           block1_conv2[0][0]
block2_conv1 (Convolution2D)  (None, 112, 112, 128)  73856       block1_pool[0][0]
block2_conv2 (Convolution2D)  (None, 112, 112, 128)  147584      block2_conv1[0][0]
block2_pool (MaxPooling2D)    (None, 56, 56, 128)    0           block2_conv2[0][0]
block3_conv1 (Convolution2D)  (None, 56, 56, 256)    295168      block2_pool[0][0]
block3_conv2 (Convolution2D)  (None, 56, 56, 256)    590080      block3_conv1[0][0]
block3_conv3 (Convolution2D)  (None, 56, 56, 256)    590080      block3_conv2[0][0]
block3_pool (MaxPooling2D)    (None, 28, 28, 256)    0           block3_conv3[0][0]
block4_conv1 (Convolution2D)  (None, 28, 28, 512)    1180160     block3_pool[0][0]
block4_conv2 (Convolution2D)  (None, 28, 28, 512)    2359808     block4_conv1[0][0]
block4_conv3 (Convolution2D)  (None, 28, 28, 512)    2359808     block4_conv2[0][0]
block4_pool (MaxPooling2D)    (None, 14, 14, 512)    0           block4_conv3[0][0]
block5_conv1 (Convolution2D)  (None, 14, 14, 512)    2359808     block4_pool[0][0]
block5_conv2 (Convolution2D)  (None, 14, 14, 512)    2359808     block5_conv1[0][0]
block5_conv3 (Convolution2D)  (None, 14, 14, 512)    2359808     block5_conv2[0][0]
block5_pool (MaxPooling2D)    (None, 7, 7, 512)      0           block5_conv3[0][0]
flatten (Flatten)             (None, 25088)          0           block5_pool[0][0]
fc1 (Dense)                   (None, 4096)           102764544   flatten[0][0]
fc2 (Dense)                   (None, 4096)           16781312    fc1[0][0]
predictions (Dense)           (None, 2)              8194        fc2[0][0]

Total params: 134,268,738
Trainable params: 8,194
Non-trainable params: 134,260,544

The .add() function sets the model.built variable to False, so I am wondering if it has anything to do with that. And if it does, how do I "build" the model? Any help is greatly appreciated.

Perhaps @jeremy would have a clue?

Could you check the ~/.keras/keras.json file and verify whether you have "image_dim_ordering": "th" set properly?

For whatever reason, your model still thinks its output is the original Dense(1000) layer.
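That's exactly what the error message says, by the way: the target check compares the shape of your label array against the model's declared output shape, where None matches any batch size. Here is a simplified, hypothetical re-implementation of that check (not Keras's actual code, just the idea):

```python
def check_target_shape(expected, actual):
    """Toy version of the target-shape check: None in the
    expected shape matches any size (the batch dimension)."""
    if len(expected) != len(actual):
        raise ValueError('rank mismatch: %s vs %s' % (expected, actual))
    for e, a in zip(expected, actual):
        if e is not None and e != a:
            raise ValueError(
                'Error when checking model target: expected predictions '
                'to have shape %s but got array with shape %s'
                % (expected, actual))

# A (None, 2) output accepts a batch of 64 two-class labels...
check_target_shape((None, 2), (64, 2))

# ...but a stale (None, 1000) output rejects them, which is your error.
try:
    check_target_shape((None, 1000), (64, 2))
except ValueError as e:
    print(e)
```

So the question is why the model's declared output shape is still (None, 1000) even though the summary shows a Dense(2) layer.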

Did you call model.compile(**kwargs) after removing and adding layers?

You may also want to check what you have for model.output and model.output_layers (I'm not 100% sure those exist on Sequential models, since I've mainly been using the functional API).

You could use the functional API like this (**etc stands in for whatever other arguments your new layer needs):

from keras.models import Model
from keras.layers import Dense

layer = model.get_layer(name='fc2')
output = Dense(2, name='predictions', **etc)(layer.output)
new_model = Model(input=model.input, output=output)


Hi David,

Thanks for getting back to me. That seems to have worked, but I would appreciate it if you could explain what you just did.

output = Dense(2, name='predictions', **etc)(layer.output)

I don’t fully understand the syntax for (layer.output). Does this mean take the output shape of layer.output and use it as the input shape for the new Dense layer?

I also don't fully grasp how model.input (wouldn't that just be the input layer, i.e. a single layer?) plus the Dense layer I just created adds up to a 23-layer model!

Check out new_model.summary() and try plotting the graph using Keras visualization utilities to get a better feel for what is going on.

You can think of your model (much like the Internet) as a series of tubes, i.e. the computational graph. You're pouring data into the input of the original model, but diverting the output of an intermediate layer into your new dense layer.

If you call a layer instance as a function on a tensor, e.g. Dense(2, ...)(input_tensor), it returns that layer's output tensor.
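To show just the call pattern (configure the layer first, then call the instance on a tensor), here's a toy stand-in class — ToyDense is made up for illustration and does none of what real Keras layers do:

```python
class ToyDense:
    """Toy stand-in for a Keras Dense layer: construct it with its
    config, then call the instance on a tensor to get an output tensor."""
    def __init__(self, units, name=None):
        self.units = units
        self.name = name

    def __call__(self, input_tensor):
        # Real Keras would build weights and return a symbolic tensor here;
        # we just record the connection to show the pattern.
        return {'produced_by': self.name,
                'shape': (None, self.units),
                'inputs': input_tensor}

# Pretend this is layer.output for the 'fc2' layer:
fc2_output = {'produced_by': 'fc2', 'shape': (None, 4096), 'inputs': None}

# Construct a 2-unit layer, then call it on the fc2 output tensor:
predictions = ToyDense(2, name='predictions')(fc2_output)
```

So Dense(2, name='predictions') builds the layer object, and the second set of parentheses wires it to layer.output and hands back the new output tensor.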

So by saying new_model = Model(input=model.input, output=output), you’re saying that the input for your new model is the same input point as the original model and data will flow through the graph of the original model until you divert it to your new layer (which generates your new output tensor).
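To make the "series of tubes" picture concrete, here is a tiny pure-Python analogy, with plain functions standing in for groups of layers (nothing here is real Keras): the shared trunk is the same either way, and the new head is just a different tube bolted onto the same intermediate output.

```python
def trunk(x):
    # Stands in for everything from the input up to fc2.
    return x + 100

def old_head(x):
    # Stands in for the original Dense(1000) head.
    return ('1000-way', x)

def new_head(x):
    # Stands in for the new Dense(2) head.
    return ('2-way', x)

intermediate = trunk(1)                # output of the shared layers
original_out = old_head(intermediate)  # original model's output
diverted_out = new_head(intermediate)  # same trunk, diverted into the new head
```

That's why new_model has 23 layers: it reuses every layer of the original graph up to fc2 and only adds the one new Dense layer on top.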

If that didn’t clear it up enough, you should also check the Keras functional API docs.


That’s very useful, thank you!