Pseudo-Labeling section in Lesson 7 throwing a ValueError

Hi,

I’m trying to run the Pseudo-Labeling code present in the lesson 7 notebook. The exact line that’s causing the problem is this:

test_batches = gen.flow(conv_test_feat, preds, batch_size=16)
and this is the exact error:

ValueError: NumpyArrayIterator is set to use the dimension ordering convention "th" (channels on axis 1), i.e. expected either 1, 3 or 4 channels on axis 1. However, it was passed an array with shape (1000, 512, 14, 14) (512 channels).

I tried googling the error but nothing showed up. Here, conv_test_feat is the output of the pre-computed conv layers on test_batches.

Thanks!

Keras assumes the data you pass to the flow method consists of ordinary images. You’re passing it the 512-channel output of your pretrained network when it expects 1, 3 or 4 channels.

If you can fit the features into memory, just call Model.fit directly on the numpy arrays. Otherwise you’ll have to create a custom class or find some other way to work with an out-of-memory array.
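As a rough sketch of the in-memory route: the pseudo-labelled test features can be appended to the training features with plain numpy, and the result handed to fit. The function name mix_pseudo_labels, the n_pseudo parameter, and the stand-in arrays are all made up for illustration; only the 512 x 14 x 14 feature shape comes from the error message above, and the 8 classes are an assumption.

```python
import numpy as np


def mix_pseudo_labels(trn_x, trn_y, test_x, test_preds, n_pseudo, seed=0):
    """Append n_pseudo randomly chosen pseudo-labelled test rows to the training arrays.

    Hypothetical helper, not part of Keras or the course code.
    """
    rng = np.random.RandomState(seed)
    # Pick distinct test rows so no pseudo-labelled sample is used twice.
    idx = rng.choice(len(test_x), size=n_pseudo, replace=False)
    x = np.concatenate([trn_x, test_x[idx]])
    y = np.concatenate([trn_y, test_preds[idx]])
    return x, y


# Stand-in data: conv features with 512 channels on axis 1, 8 classes (assumed).
trn_x = np.zeros((20, 512, 14, 14), dtype=np.float32)
trn_y = np.zeros((20, 8), dtype=np.float32)
test_x = np.ones((10, 512, 14, 14), dtype=np.float32)
test_preds = np.full((10, 8), 0.125, dtype=np.float32)

x, y = mix_pseudo_labels(trn_x, trn_y, test_x, test_preds, n_pseudo=5)
print(x.shape, y.shape)
# x and y can then be passed to fit on whatever model you built on top of
# the conv features; no image generator is involved, so no channel check.
```

How many pseudo-labelled rows to mix in is a judgment call; the value 5 here is only there to make the example run.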

Hi @davecg
Wait, so what do I pass into MixIterator? How would you implement pseudo-labeling?

test_batches = gen.flow(conv_test_feat, preds, batch_size=16)
train_batches  = gen.flow(conv_feat, preds, batch_size=16)
val_batches = gen.flow(conv_val_feat, preds, batch_size=16)

The above are then passed to the MixIterator class:
mi = MixIterator([batches,test_batches,val_batches])

That iterator, not the features directly, is then used to fit the model. I’m talking explicitly about pseudo-labeling in lesson 7. :slight_smile: Am I missing something?

Thanks!

It looks like the image preprocessing script in Keras was updated in the last month, and the NumpyArrayIterator now checks that the data looks like image data (that’s the error you’re seeing). I’m guessing that’s new since the first part finished.

Probably the easiest thing would be to copy the code for preprocessing.image from fchollet’s GitHub and comment out that section, rather than importing it directly from Keras. :slight_smile:
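If you would rather not carry a patched copy of image.py, another option is a small batch iterator of your own that never looks at the channel axis. The sketch below is a hypothetical stand-in (the class name ArrayBatchIterator is made up, and it does no augmentation at all); it only relies on len() and slicing, so it should work on numpy arrays as well as the plain lists used here. Whether it drops into MixIterator unchanged depends on what that class expects from its inputs, which is not checked here.

```python
import random


class ArrayBatchIterator(object):
    """Endlessly yield (x_batch, y_batch) slices from two aligned, sliceable arrays.

    Hypothetical helper, not part of Keras. It performs no shape or channel checks.
    """

    def __init__(self, x, y, batch_size=16, shuffle=False, seed=None):
        if len(x) != len(y):
            raise ValueError('x and y must have the same number of samples')
        if len(x) == 0 or batch_size < 1:
            raise ValueError('need at least one sample and a positive batch_size')
        self.x, self.y = x, y
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.rng = random.Random(seed)
        # Start offset of every batch; the last batch may be shorter.
        self.starts = list(range(0, len(x), batch_size))
        self.pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.pos == 0 and self.shuffle:
            # Reorders whole batches at the start of each pass, not individual samples.
            self.rng.shuffle(self.starts)
        start = self.starts[self.pos]
        self.pos = (self.pos + 1) % len(self.starts)
        stop = start + self.batch_size
        return self.x[start:stop], self.y[start:stop]

    next = __next__  # Python 2 spelling


# Tiny usage example with lists standing in for feature and label arrays.
it = ArrayBatchIterator(list(range(10)), list(range(10, 20)), batch_size=4)
first_x, first_y = next(it)
print(first_x, first_y)
```

Slicing contiguous ranges keeps reads cheap if the features live on disk; the trade-off is that samples within a batch always stay together.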

You can also use Dask (dask.pydata.org).

Dask arrays work with model.fit, and you can put your data loading script in a dask.delayed function.

I’m running into the same problem. I tried to use dask but am getting the same error. Where am I going astray?

import dask.array as da
#testms = da.from_array(conv_test_feat)
conv_test_feat = da.from_array(conv_test_feat,chunks=(1000, 512, 22, 40))
preds = da.from_array(preds, chunks=(100,8))

test_batches = gen.flow(conv_test_feat, preds, batch_size=16)

ValueError                                Traceback (most recent call last)
<ipython-input-247-ba8f45991d06> in <module>()
----> 1 test_batches = gen.flow(conv_test_feat, preds, batch_size=16)

/home/jd/anaconda3/lib/python3.5/site-packages/keras/preprocessing/image.py in flow(self, X, y, batch_size, shuffle, seed, save_to_dir, save_prefix, save_format)
    425             save_to_dir=save_to_dir,
    426             save_prefix=save_prefix,
--> 427             save_format=save_format)
    428 
    429     def flow_from_directory(self, directory,

/home/jd/anaconda3/lib/python3.5/site-packages/keras/preprocessing/image.py in __init__(self, x, y, image_data_generator, batch_size, shuffle, seed, dim_ordering, save_to_dir, save_prefix, save_format)
    688                              'either 1, 3 or 4 channels on axis ' + str(channels_axis) + '. '
    689                              'However, it was passed an array with shape ' + str(self.x.shape) +
--> 690                              ' (' + str(self.x.shape[channels_axis]) + ' channels).')
    691         if y is not None:
    692             self.y = np.asarray(y)

ValueError: NumpyArrayIterator is set to use the dimension ordering convention "th" (channels on axis 1), i.e. expected either 1, 3 or 4 channels on axis 1. However, it was passed an array with shape (1000, 512, 22, 40) (512 channels).