Python and keras questions and tips

(Jeremy Howard (Admin)) #28

You can use copy_layer, copy_layers, copy_weights, and copy_model from See the source to see how they work.

(garima.agarwal) #29


Does np_utils.to_categorical have the same result as OneHot Encoding?

I implemented DenseNet for a small data set that I have. In the reference implementation they used np_utils.to_categorical on CIFAR10 dataset to convert the labels to binary. I felt that with our OneHot encoding mechanism we achieved the same thing but would like to get expert opinion.

It’s hard to tell, since both return a 2-dimensional array of 1s and 0s.


(Jeremy Howard (Admin)) #30

I believe so - check the source to be sure though.

(garima.agarwal) #31

The code is different but the intent seems to be the same in to_categorical:

y = np.array(y, dtype='int').ravel()
if not num_classes:
    num_classes = np.max(y) + 1
n = y.shape[0]
categorical = np.zeros((n, num_classes))
categorical[np.arange(n), y] = 1
return categorical
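
To make the comparison concrete, here is a minimal sketch that wraps the lines above in a standalone function and checks them against a second common one-hot recipe, indexing into an identity matrix. The function name to_one_hot and the sample labels are made up for illustration; this is not the Keras source itself.

```python
import numpy as np

def to_one_hot(labels, num_classes=None):
    """Hypothetical helper mirroring the to_categorical logic quoted above."""
    y = np.asarray(labels, dtype=int).ravel()
    if num_classes is None:
        num_classes = int(y.max()) + 1
    one_hot = np.zeros((y.shape[0], num_classes))
    # Set a single 1 per row, in the column given by that row's label.
    one_hot[np.arange(y.shape[0]), y] = 1
    return one_hot

labels = [0, 2, 1, 2]              # illustrative labels, not from the thread
via_indexing = to_one_hot(labels)
via_identity = np.eye(3)[labels]   # row k of the identity is the one-hot of k
```

If both arrays match for your own labels, the two encodings are interchangeable for training purposes.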

One more question for you.

I wasn’t getting very good results on my DenseNet, so I printed out my data. I am very surprised to see what get_data is doing.

My notebook code is as simple as:

val_data = get_data(path+'valid')

And here is the get_data code

def get_batches_portrait(dirname, gen=image.ImageDataGenerator(), shuffle=False, batch_size=4, class_mode='categorical'):
    return gen.flow_from_directory(dirname, target_size=(540,270),
            class_mode=class_mode, shuffle=shuffle, batch_size=batch_size)

def get_data(path):
    batches = get_batches_portrait(path, shuffle=False, batch_size=4, class_mode=None)
    return np.concatenate([batches.next() for i in range(batches.nb_sample)])

For some reason it has taken my image and applied some kind of filter to it.

Any ideas what I might be missing?

(Jeremy Howard (Admin)) #32

Can you show the original image, to compare?

(garima.agarwal) #33

(David Gutman) #34

I would look at the maximum and minimum values of both images.

Images should generally be in [0,1] if float or [0,256) if int. Also make sure there are no NaNs.

You may also have switched the channel order from RGB to BGR if you were using VGG, so if you did that, make sure you undo it when viewing the images (img[...,::-1]).
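
A minimal sketch of those checks, assuming the image is a NumPy array with channels on the last axis. The helper names summarize_image and swap_channel_order are invented for this example.

```python
import numpy as np

def summarize_image(img):
    """Report dtype, value range and NaN presence for an image array."""
    arr = np.asarray(img)
    return {
        "dtype": str(arr.dtype),
        "min": float(np.nanmin(arr)),
        "max": float(np.nanmax(arr)),
        "has_nan": bool(np.isnan(arr).any()),
    }

def swap_channel_order(img):
    """Reverse the last axis, turning RGB into BGR or BGR back into RGB."""
    return np.asarray(img)[..., ::-1]

# One illustrative pixel, taken from the values printed later in the thread.
pixel_image = np.array([[[133., 131., 134.]]])
summary = summarize_image(pixel_image)
swapped = swap_channel_order(pixel_image)
```

A float image whose maximum is well above 1 is a hint that it holds 0-255 values and needs converting before display.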

(garima.agarwal) #35

Thanks David!
It’s weird.

I created a separate folder and ran this code

def get_data(path):
    gen = image.ImageDataGenerator()
    batches = gen.flow_from_directory(path, target_size=(540,270),
            class_mode=None, shuffle=False, batch_size=2 )
    result  = np.concatenate([batches.next() for i in range(batches.nb_sample)])
    return result

val_data = get_data(path+'imgflip')

plt.imshow(val_data[0]) shows the negative, but

plt.imshow(val_data[0]*255) shows the correct image, even though printing val_data[0] itself shows values above 1:

array([[[ 133., 131., 134.],
[ 127., 126., 131.],
[ 123., 124., 129.],
[ 113., 117., 126.],
[ 108., 115., 125.],
[ 99., 107., 118.],
[ 91., 101., 111.],
[ 88., 101., 110.],
[ 82., 95., 104.],
[ 76., 89., 98.],
[ 68., 78., 87.],
[ 60., 69., 74.],

I am going to train my model again on the multiplied values, because I am not sure whether the negative values will affect the model.

Let me know if you have any thoughts.

(Jeremy Howard (Admin)) #36

You need to use


It’s just a plotting issue - unrelated to modeling.
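
One possible explanation for the symptoms, offered as an assumption rather than confirmed behaviour: if the plotting code scales float pixels by 255 and keeps only the low 8 bits, a float image holding 0-255 values wraps around into its negative, and multiplying by 255 first wraps it back. The sketch below models that arithmetic with integers and shows one common way to get a displayable array, casting to uint8. Both helper names are hypothetical.

```python
import numpy as np

def simulate_wraparound(pixels, scale=255):
    """Assumed model: multiply by 255, then keep only the low 8 bits."""
    return (np.asarray(pixels).astype(np.int64) * scale) % 256

def to_uint8(img):
    """Clip to the 0-255 range and cast so the array is a plain 8-bit image."""
    return np.clip(img, 0, 255).astype(np.uint8)

pixels = np.array([133., 131., 134.])         # first pixel printed above
negative = simulate_wraparound(pixels)        # what the raw float image turns into
restored = simulate_wraparound(pixels * 255)  # why the *255 version looks right
displayable = to_uint8(pixels)
```

With an array prepared like this, plt.imshow(to_uint8(val_data[0])) should show the image as expected, and the training data itself does not need to change.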

(garima.agarwal) #37

:disappointed: That’s embarrassing.

Thank you.


This is exactly what happens in the dogscats-ensemble notebook when the function split_at is called:

conv_layers,fc_layers = split_at(model, Convolution2D)
conv_model = Sequential(conv_layers)
Layer (type)                     Output Shape          Param #     Connected to                     
lambda_1 (Lambda)                (None, 3, 224, 224)   0           lambda_input_1[0][0]             
zeropadding2d_1 (ZeroPadding2D)  (None, 3, 226, 226)   0           lambda_1[0][0]                   
convolution2d_1 (Convolution2D)  (None, 64, 224, 224)  1792        zeropadding2d_1[0][0]            

(Jeremy Howard (Admin)) #39

Yeah I noticed that too. It doesn’t seem to have any practical impact AFAICT.


I feel a little uncomfortable with these “connected to” entries, especially because running the same cells over and over again in your notebook adds a new “connected to” each time. That may have unexpected side effects, either now or in a future Keras version, so it is much better to use the copy_layer functions.

Two related issues I have come across:

  • In the lessons we append one sequential model to another. But then when you list the layers, the appended sequential model appears as a single layer, so you cannot directly see what is in it. I think it is better to use copy_layers again so you can view the resulting model.
  • If you save models, read them back in, and then combine them with layers from other models, you sometimes end up with layer name clashes. So you need functions for combining models that also force unique layer names before compiling.
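
The name-clash point can be sketched without touching Keras at all. The helper below, make_unique_names, is hypothetical and works on plain strings; how the new names are then assigned to layers depends on the Keras version and is left out.

```python
def make_unique_names(names):
    """Return the names in order, appending a _copyN suffix to any repeats."""
    used = set()
    result = []
    for name in names:
        candidate, suffix = name, 1
        # Keep trying suffixes until the candidate has not been used yet.
        while candidate in used:
            candidate = "{}_copy{}".format(name, suffix)
            suffix += 1
        used.add(candidate)
        result.append(candidate)
    return result

# Illustrative layer names, as might come from two separately saved models.
combined = make_unique_names(["dense_1", "dense_1", "flatten_1", "dense_1"])
```

The first occurrence of each name is kept unchanged, so weights saved under the original names can still be matched for the first model.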


If I want to split Resnet50, at what point is that feasible? The last layer, a Dense layer with 1000 outputs, has to change to a Dense layer with 2 outputs for dogscats. I am not sure where and how to cut it.

In Resnet50 the last 3 layers are AveragePooling2D -> Flatten -> Dense (the final layer).
In previous splits with vgg16 we cut at the Flatten layer and then added BatchNormalization etc.

Or should I simply remove the last Dense layer and replace it with Dense(2, activation='softmax')?

When I do this, I get an error when I check whether the layer is there using model.summary(): tried to call xxxx but layer isn’t built.

Not sure what that means with Resnet50 model

Ok I guess I just bumped up against the Functional API

def finetune(self, batches):
    model = self.model
    for layer in model.layers: layer.trainable=False
    m = Dense(batches.nb_class, activation='softmax')(model.layers[-1].output)
    self.model = Model(model.input, m)
    self.model.compile(optimizer=RMSprop(lr=0.1), loss='categorical_crossentropy', metrics=['accuracy'])

I finally have a model compiled, using the lines from this function modified to apply directly to the model defined in my notebook. I could not call this function on the Resnet50 model I imported, perhaps because I only imported it by name.

I can now see in my model summary that the output is a 2-way softmax called output.

(Gertjan Brouwer) #42

Hi, My name is Gertjan Brouwer and I am new to ML and AI and also to this forum.

I’m trying to learn AI by using and the Keras documentation, I am currently following this tutorial: Keras tutorial .
The first training run worked fine and I got about 80% accuracy with 2 hours of training. But now I want to use VGG16 and expand my training to 14 classes by using the second tutorial on that page.

I tried changing binary_crossentropy to categorical_crossentropy, but that did not work. I also changed the last dense layer from 1 to 14, but I keep getting this error: valueerror error when checking model expected shape (none, 10) but got array with shape(0, 44927)). This is the code I am currently using: gist.
I also tried changing training_samples and validation_samples from 2 to 44927, but that gave me the same error with different shapes. I still think the problem lies with this piece of code:

train_labels = np.array([0] * (nb_train_samples / 2) + [1] * (nb_train_samples / 2))
validation_data = np.load(open('bottleneck_features_validation.npy'))
validation_labels = np.array([0] * (nb_validation_samples / 2) + [1] * (nb_validation_samples / 2))

Another problem might be that my 14-class training data is not evenly spread: I have classes with 6000 images and also classes with only 1000. I figured that would work fine because it worked on the first model too.

At this point I am unsure what to try. I am even willing to change everything and follow another tutorial if you can point me to one that does image classification with the VGG16 model.

Kind regards,

Gertjan Brouwer

(David Gutman) #43

You need to one hot encode your labels (should be a 44k x 14 array of 1s and 0s).

You can code one yourself or use OneHotEncoder from sklearn.preprocessing (create instance and use the fit_transform method).
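
A minimal sketch of coding it yourself with NumPy, assuming the bottleneck features were saved class by class in a fixed order (for example from a generator run with shuffle=False) and that you know how many images each class has. The function name labels_from_class_counts and the tiny counts are illustrative only.

```python
import numpy as np

def labels_from_class_counts(class_counts):
    """Build a one-hot label matrix for samples stored class by class."""
    num_classes = len(class_counts)
    # e.g. counts [3, 1, 2] -> class ids [0, 0, 0, 1, 2, 2]
    class_ids = np.repeat(np.arange(num_classes), class_counts)
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes)[class_ids]

counts = [3, 1, 2]   # stand-in for the real per-class image counts
labels = labels_from_class_counts(counts)
```

Unlike the [0] * (n / 2) + [1] * (n / 2) pattern from the binary tutorial, this does not assume the classes are the same size, so unevenly spread classes are handled.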

(oskar) #44

Hi - I’m pretty new to programming in Python and don’t really understand everything. Maybe someone can help :slight_smile:
When I call vgg.get_batches(…), I get the batches, but how do I know what exactly these batches are? What are all the properties of the batches (e.g. how would I know that you can access filenames through the batches)?

(Corbin Albert) #45

@oskar, my recommendation would be to follow the code down the trail.

So in your jupyter notebook, after you have imported utils, make a new code line and run the following: ??get_batches

This is going to show you that it is returning a gen.flow_from_directory object. What is gen? Let’s look in the arguments to the function. Looks like it’s an image.ImageDataGenerator(). So now in your python notebook, you can type in ??image.ImageDataGenerator.flow_from_directory and run the code. You’ll find it returns a DirectoryIterator.

Well damn, what the hell is that? Time to bust out the ol’ trusty Google. Searching ‘DirectoryIterator Keras’ will point you to the top link, which is the actual method. You could have run ??image.DirectoryIterator in your Jupyter notebook, but there was no immediately easy way to know it would belong to image and not image.ImageDataGenerator or what-have-you.

If you go to line 893, you’ll find the Directory Iterator, with all the attributes you could ever hope to know about! And that is, eventually, what get_batches returns to you.

Hope this helps, let me know if anything is still unclear!
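
Alongside reading the source, plain Python introspection answers the same question from inside the notebook. The sketch below uses a made-up stand-in class, FakeDirectoryIterator, so it runs without Keras; on the real object you would pass whatever get_batches returned.

```python
class FakeDirectoryIterator(object):
    """Hypothetical stand-in for the object that get_batches returns."""
    def __init__(self, filenames, batch_size):
        self.filenames = filenames
        self.batch_size = batch_size

def public_attributes(obj):
    """List attribute and method names that do not start with an underscore."""
    return sorted(name for name in dir(obj) if not name.startswith("_"))

batches = FakeDirectoryIterator(["cats/cat.1.jpg", "dogs/dog.1.jpg"], batch_size=4)
type_name = type(batches).__name__   # tells you which class to look up
attrs = public_attributes(batches)   # everything you can access on it
state = vars(batches)                # the current values of its instance attributes
```

type() tells you which class to search for, dir() shows what you can call or read, and vars() shows the values the object currently holds.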

(oskar) #46

@corbin, thank you for the answer, this helps a lot.

(Tait Larson) #48

This happens when you accidentally install Keras 2 instead of Keras 1.2.