Lesson 7 with vgg640.predict_generator error

Hi there,

I’m trying to work through Lesson 7 on my home machine, which has 16GB of system memory, and I’m hitting errors in the “Larger Size” 640x360 image section, where we load the images with trn = get_data(path+'train', (360,640)). This raises a memory error because the full set of images does not fit in system memory.

To get around this, a forum thread (sorry, I lost the link) suggested using ImageDataGenerator’s flow_from_directory. However, those examples used fit_generator, not predict_generator, which is what the Lesson 7 code calls for.

So I tried this:

batch_size = 32

vgg640 = Vgg16BN((360, 640)).model
vgg640.input_shape, vgg640.output_shape
vgg640.compile(Adam(), 'categorical_crossentropy', metrics=['accuracy'])

gen = image.ImageDataGenerator()
trn_batches = gen.flow_from_directory(path+'train', target_size=(360, 640), batch_size=batch_size)
num_trn_batches = 3404 // batch_size
conv_trn_feat = vgg640.predict_generator(trn_batches, num_trn_batches)

and I hit the error below, where the batch size inside predict_generator seems to get confused. Where is the 10 coming from?

Found 3404 images belonging to 8 classes.
Traceback (most recent call last):
  File "test_bug.py", line 23, in <module>
    conv_trn_feat = vgg640.predict_generator(trn_batches, num_trn_batches)
  File "/home/rallen/anaconda2/lib/python2.7/site-packages/keras/models.py", line 1012, in predict_generator
  File "/home/rallen/anaconda2/lib/python2.7/site-packages/keras/engine/training.py", line 1777, in predict_generator
    all_outs[i][processed_samples:(processed_samples + nb_samples)] = out
ValueError: could not broadcast input array from shape (32,512,22,40) into shape (10,512,22,40)

Has anyone hit and/or gotten past this issue?

Thanks in advance,


p.s. This looks very much like this open Keras issue.
p.s. I did notice this question, but I’m using the default Keras 1 install, not Keras 2.

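I think I found my mistake. Stepping through the predict_generator code, it seems the second argument (val_samples in Keras 1) is the total number of samples to predict, not the number of batches, so it should be set to the full number of images (3404). That also explains the 10: with 3404 // 32 = 106, Keras allocated room for only 106 samples, and after three batches of 32 just 10 slots were left for the fourth batch. With that change, I don’t get the error.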

num_trn_images = 3404
conv_trn_feat = vgg640.predict_generator(trn_batches, num_trn_images)
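
For anyone who wants to check the arithmetic without loading VGG, here is a minimal NumPy sketch of the buffer-filling step shown in the traceback. It is a simplified model, not the actual Keras code: fake_batches and collect are hypothetical helpers written for this illustration, and a tiny feature shape of (2,) stands in for the real (512, 22, 40) so it runs in almost no memory.

```python
import numpy as np

def fake_batches(n_images, batch_size, feat_shape):
    """Hypothetical stand-in for the generator: yields batches forever,
    with a shorter final batch on each pass over the data."""
    while True:
        for start in range(0, n_images, batch_size):
            n = min(batch_size, n_images - start)
            yield np.zeros((n,) + feat_shape, dtype=np.float32)

def collect(gen, val_samples, feat_shape):
    """Simplified model of the loop in the traceback: preallocate
    val_samples rows, then copy each batch into the next free slice."""
    all_outs = np.zeros((val_samples,) + feat_shape, dtype=np.float32)
    processed = 0
    while processed < val_samples:
        out = next(gen)
        n = len(out)
        # If fewer than n rows remain, the slice is shorter than the batch
        # and NumPy raises a broadcast ValueError.
        all_outs[processed:processed + n] = out
        processed += n
    return all_outs

feat_shape = (2,)  # assumption: tiny stand-in for (512, 22, 40)

# Passing the batch count (106) leaves only 10 rows after 3 batches of 32.
try:
    collect(fake_batches(3404, 32, feat_shape), 3404 // 32, feat_shape)
except ValueError as err:
    print("batch count as val_samples:", err)

# Passing the image count works, including the final batch of 12.
feats = collect(fake_batches(3404, 32, feat_shape), 3404, feat_shape)
print("image count as val_samples:", feats.shape)
```

The first call fails on the fourth batch with the same kind of broadcast error as in the traceback, while the second fills all 3404 rows because the last short batch of 12 exactly matches the remaining space.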