Predict_generator output shape

atk0 · May 21, 2017, 6:10pm

Hi,
I’m seeing some strange behavior with the output shape from predict_generator on my local machine. I’m working on the fisheries competition, and using one of our standard approaches - take the convolution layers from the pre-trained VGG model, and then feed the outputs to train a new set of Dense layers.

Just for simplicity, my sample training data has 16 total samples across 8 classes (2 samples per class).

Here is my code:
train_dir = DATA_DIR + '\sample\train'
batch_size = 16
batches = get_batches(train_dir, batch_size=batch_size, shuffle=False)
last_conv_idx = [idx for idx, layer in enumerate(model.layers) if type(layer) is Convolution2D][-1]
conv_layers = model.layers[:last_conv_idx+1]
conv_model = Sequential(conv_layers)
trn_features = conv_model.predict_generator(batches, batches.samples, verbose=1)

Here is what I get when I run trn_features.shape:
(256L, 512L, 14L, 14L)

I would have expected this to be: (16L, 512L, 14L, 14L)

It looks like predict_generator’s output shape is total samples * batch size?

This has been driving me crazy - what am I doing wrong? I used the above code for the state farm competition but didn’t see this issue.

atk0 · May 21, 2017, 6:24pm

Just to add, I am using Keras 2.0 on my local machine.

atk0 · May 21, 2017, 8:13pm

Further update - I ran statefarm on AWS. When I transferred this fisheries notebook to AWS, it also ran fine and gave the expected shape. I believe the AWS AMI uses Keras 1.x.

So this looks to be a change in behavior with Keras 2.0 - I’ll dig a bit more to see what can be done to get the expected behavior here.

sivark · May 24, 2017, 6:03pm

I just encountered the exact same problem. This is due to breaking changes in Keras 2.0 (see the changelog at https://github.com/fchollet/keras/releases/tag/2.0.0).
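
For anyone landing here later, a rough sketch of the arithmetic behind the 256. It assumes that in Keras 2.x the second argument of predict_generator is the number of batches to draw (steps), whereas in Keras 1.x it was the number of samples. The helper name steps_for is made up for illustration, and the Keras call at the bottom is left commented out, so treat it as a guess at the fix rather than tested code.

```python
import math

def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once (hypothetical helper)."""
    return int(math.ceil(num_samples / float(batch_size)))

num_samples, batch_size = 16, 16

# Passing the sample count where Keras 2.x expects a step count:
# 16 steps, each yielding a batch of 16 images.
rows_when_passing_samples = num_samples * batch_size

# Passing the step count instead: ceil(16 / 16) = 1 step of 16 images.
rows_when_passing_steps = steps_for(num_samples, batch_size) * batch_size

print(rows_when_passing_samples)  # 256
print(rows_when_passing_steps)    # 16

# Assumed change to the call in the first post (not run here):
# trn_features = conv_model.predict_generator(
#     batches, steps_for(batches.samples, batch_size), verbose=1)
```

This matches the (256L, 512L, 14L, 14L) shape reported above: the generator simply loops over the same 16 images 16 times.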

EDIT: Keras’ ImageDataGenerator seems designed to generate images in an endless loop. That may be convenient for training/validation data, but since predict_generator now draws a whole number of batches from it, I wonder whether the best way out would be to write a custom (batched) generator for test images.
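
A minimal sketch of what such a custom generator could look like, in plain Python. It walks over a list once, yields a shorter final batch if the list does not divide evenly, and then stops instead of looping. The name finite_batches and the file names are hypothetical, and image loading is left out entirely.

```python
def finite_batches(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long, then stop."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Hypothetical file names standing in for test images.
test_files = ["img_%02d.jpg" % i for i in range(10)]

chunks = list(finite_batches(test_files, batch_size=4))
chunk_sizes = [len(c) for c in chunks]
print(chunk_sizes)  # [4, 4, 2]
```

Each chunk could then be loaded into an array and passed to the model's predict_on_batch, so every test image is predicted exactly once and in order.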