Lesson 7 discussion

Please use this thread to ask questions about or discuss lesson 7. The wiki page, including links to video, notebooks, and resources, is here: http://wiki.fast.ai/index.php/Lesson_7


with the new vgg16.py, I was getting an error in the fish.ipynb:

from vgg16 import Vgg16
model = vgg_ft(8)
Exception: Input 0 is incompatible with layer dense_1: expected ndim=2, found ndim=4

After some debugging I figured out that the new Vgg16.create() method returns the output of the conv layers directly without adding a Flatten(), so the tensor has 4 dimensions (?, 3, 228, 228) instead of the 2 that the Dense() layer's input_spec expects.

I inserted a Flatten() layer before the line that adds the Dense() layer in Vgg16.ft() and now it works for me (you can argue that the right place to add it is in the Vgg16.create() method, before the return, but I didn’t try that yet). I paste my fix below for those who come across the same issue:

def ft(self, num):
    model = self.model
    for layer in model.layers: layer.trainable=False
    model.add(Flatten())  # the fix: Dense() expects 2D input, conv output is 4D
    model.add(Dense(num, activation='softmax'))

Please forgive me if this turns out to be a silly oversight on my part, but I can’t seem to locate the ipynb that was shown in Lesson 7 here.

This is the notebook that had the ResNet demonstration in it.

I don’t see the ResNet stuff in either the Lesson7.ipynb or the fish.ipynb…

Am I totally missing something obvious, or is there a 3rd notebook that isn’t up yet? Thanks!

Can you show (maybe in a gist) the code that you’re using to create the data that you’re calling vgg_ft() with?

I included resnet50.py on the wiki, but not the notebook that called it - since it’s a pretty trivial addition to the lesson3 notebook so I figured it would be best for people to try creating it themselves. Let us know if you have any issues with this…

so the Global Average Pooling definition, etc, lives on video only? I’ll have to sit down tomorrow and really check out all the pieces and parts and see what lives where. :smiley:

(Just curious, why is this part of lesson 7 an extension of Lesson 3?)

this is the gist

GlobalAveragePooling2D is part of Keras https://keras.io/layers/pooling/#globalaveragepooling2d . There’s nothing only on video other than my simple little networks (the most effective of which is just 3 lines of code).
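For readers wondering what the layer computes, here is a minimal pure-Python sketch of global average pooling for a single sample, assuming channels-first feature maps stored as nested lists. The function name global_average_pool is made up for illustration; in Keras the GlobalAveragePooling2D layer does the equivalent over a whole batch.

```python
def global_average_pool(feature_maps):
    """Average each channel's HxW grid down to a single number.

    feature_maps: nested list shaped (channels, height, width).
    Returns a flat list with one mean per channel.
    """
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Two 2x2 channels collapse to two numbers, one per channel.
sample = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
print(global_average_pool(sample))
```

Because each channel is reduced to one value, the output size depends only on the number of channels, not on the spatial size of the feature maps.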

I used cats and dogs for showing resnet since it’s a good fit for that dataset - I mention why in the lesson.


Oops my dumb fault. @rachel just hit that same bug. I just uploaded a fixed Vgg16.py . Thanks!

I have tried Resnet conv + simple FC and Resnet conv + Global_avg_pool on cats and dogs and fisheries. My results are worse than VGG with batch norm. Did anyone else give it a try?

Also, I did store the conv features and fed it as input to FC / Global_avg_pooling dense models.

Edit: I realized that I was not using shuffle=False for training batch. Fixing that improved Redux score a lot, but still not much improvement in fisheries. This is probably as expected, as resnet was trained on ImageNet. Feel like retraining last few conv layers might help? I am going to try that next.

Can you comment on the architecture of a CNN that does not output a probability, but a number?
An example would be finding the head and tail pixels of the fishes. This would be 2 coordinates or 4 numbers.
I see how one could use vgg with Dense(4) as the final output, but… is a special activation used or no activation at all?

Additionally, I understand why the input is normalized, but why does the training output need to be normalized [-1,1]. Or does it?
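No answer appears in this thread, so treat the following as one common setup rather than the course's official one: leave the final Dense(4) layer linear (no activation) and train with mean squared error. Scaling the targets is not strictly required, but raw pixel coordinates tend to produce very large loss values, so rescaling them to roughly [-1, 1] is a frequent choice. The helper names below are hypothetical, and the image size and coordinates are invented for the example.

```python
def scale_coords(coords, width, height):
    """Map pixel coordinates (x1, y1, x2, y2) into [-1, 1]."""
    x1, y1, x2, y2 = coords
    return [2 * x1 / width - 1, 2 * y1 / height - 1,
            2 * x2 / width - 1, 2 * y2 / height - 1]


def unscale_coords(scaled, width, height):
    """Invert scale_coords to recover pixel coordinates."""
    sx1, sy1, sx2, sy2 = scaled
    return [(sx1 + 1) * width / 2, (sy1 + 1) * height / 2,
            (sx2 + 1) * width / 2, (sy2 + 1) * height / 2]


def mse(pred, target):
    """Mean squared error, a usual loss for a linear regression head."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


# Made-up example: head at (320, 120), tail at (960, 360), image 1280x720.
target = scale_coords([320, 120, 960, 360], 1280, 720)
print(target)
```

Predictions from a network trained on scaled targets would be passed through unscale_coords to get pixel positions back.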

In the “Basic VGG” section of this lesson.
A model is first created using: model = vgg_ft_bn(8)
The layers of this model are set to be untrainable upon creation. The top layers are removed and a dense top layer is added for the classification. This model is then trained on the data. After which the conv layers are used to predict features for the train data.
I don’t understand the point of actually training this model. Wouldn’t it be just as good to create the original vgg16 model minus the top layer and use that to predict the features?


In lesson7.ipynb in the Pseudo-labeling section, the first cell has the predictions call:

preds = model.predict([conv_test_feat, test_sizes], batch_size=batch_size*2)

but the test_sizes variable is not defined in the notebook. Can anyone help with this?

Thank you!

To clarify, it seems that pseudo-labeling was tried with the model that uses additional “meta-data” about the image size. That model has multiple input layers; specifically, sz_inp is the input for size. If you want to use this model for pseudo-labeling, you have to run that model's code as well as add this code:

raw_test_sizes = [PIL.Image.open(path+'test/'+f).size for f in val_filenames]
test_sizes = to_categorical([size2id[o] for o in raw_test_sizes], len(id2size))
test_sizes = test_sizes-trn_sizes_orig.mean(axis=0)/trn_sizes_orig.std(axis=0)

More likely your model is not using the meta data and therefore it should be as simple as changing the code to this:

preds = model.predict(conv_test_feat, batch_size=batch_size*2)

Thanks very much!

In lesson7.ipynb in the Pseudo-labeling section

test_batches = gen.flow(conv_test_feat, preds, batch_size=16)

But gen.flow only allows 1, 3 or 4 channels on axis=1, so how do we use features precomputed from the VGG conv layers here? I am getting this error:

ValueError: NumpyArrayIterator is set to use the dimension ordering convention "th" (channels on axis 1), i.e. expected either 1, 3 or 4 channels on axis 1. However, it was passed an array with shape (19662L, 512L, 14L, 14L) (512 channels).
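One way around this, sketched here under assumptions, is to skip ImageDataGenerator for precomputed features and batch the arrays with a plain generator, since a default ImageDataGenerator() applies no augmentation anyway. array_batches is a hypothetical helper, not part of Keras or the course's utils, and it would likely need adapting before being passed to MixIterator.

```python
import random


def array_batches(features, labels, batch_size, shuffle=True, seed=None):
    """Yield (features, labels) batches forever from two aligned sequences.

    Works on anything that supports len() and slicing, e.g. lists or
    NumPy arrays of precomputed conv features with any channel count.
    """
    assert len(features) == len(labels), "features and labels must align"
    starts = list(range(0, len(features), batch_size))
    rng = random.Random(seed)
    while True:
        if shuffle:
            rng.shuffle(starts)  # shuffles batch order, not rows within a batch
        for s in starts:
            yield features[s:s + batch_size], labels[s:s + batch_size]


# Tiny stand-in data: 5 "feature" rows and their labels.
gen = array_batches(list(range(5)), list("abcde"), batch_size=2, shuffle=False)
first = next(gen)
print(first)
```

The generator loops endlessly, which matches how fit_generator consumes data: it stops after the requested number of samples per epoch rather than when the generator is exhausted.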


I’m going through the notebook and trying to make some submissions. However, my Kaggle leaderboard loss is very different from my validation log loss. E.g., I’m now at the bounding-boxes section, and I get a validation loss of 0.12 and accuracy of 0.98. When I submitted to Kaggle, I got a loss of 1.12. I’ve made a spreadsheet in which I tried to work out my real accuracy from the leaderboard log loss, and I got (with a 0.82 threshold) that my leaderboard accuracy is around 0.73.
Does my calculation seem correct? If so, why the difference?
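The arithmetic behind that estimate can be reproduced as follows. This is a sketch under assumptions the post does not spell out: 8 classes, every prediction clipped so the top class gets 0.82 and the other seven share the remaining 0.18 equally, and a leaderboard metric of multi-class log loss. The function name is hypothetical.

```python
import math


def accuracy_from_logloss(logloss, top_prob, n_classes):
    """Solve logloss = -(a*ln(p_top) + (1-a)*ln(p_other)) for accuracy a."""
    p_other = (1.0 - top_prob) / (n_classes - 1)
    loss_if_right = -math.log(top_prob)   # cost of a correct prediction
    loss_if_wrong = -math.log(p_other)    # cost of an incorrect prediction
    return (loss_if_wrong - logloss) / (loss_if_wrong - loss_if_right)


acc = accuracy_from_logloss(1.12, top_prob=0.82, n_classes=8)
print(round(acc, 2))
```

Under these assumptions the result rounds to 0.73, which agrees with the spreadsheet figure. The estimate only holds if the submitted probabilities really were clipped this way; without clipping, a handful of confidently wrong predictions can account for most of the loss.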

@shgidi this from the kaggle discussion forum gives you some hints as to why:

also look at:

Thanks! My partner pointed this out as well. However, I don’t fully understand this, because it seems the test set is similar to the training set. Anyway, I think I’ll try to split the ships between validation and training…

I get the error:

ValueError: Error when checking : expected lambda_input_1 to have shape (None, 3, 224, 224) but got array with shape (500, 512, 22, 40)

In the last line of the pseudo labeling section:

preds = model.predict(conv_test_feat, batch_size=batch_size*2)
gen = image.ImageDataGenerator()
test_batches = gen.flow(conv_test_feat, preds, batch_size=16)
val_batches = gen.flow(conv_val_feat, val_labels, batch_size=4)
batches = gen.flow(conv_feat, trn_labels, batch_size=44)
mi = MixIterator([batches, test_batches, val_batches])
bn_model.fit_generator(mi, mi.N, nb_epoch=8, validation_data=(conv_val_feat, val_labels))

I had to modify the first line from this:
preds = model.predict([conv_test_feat, test_sizes], batch_size=batch_size*2)

to this: (as suggested previously in the thread)
preds = model.predict(conv_test_feat, batch_size=batch_size*2)

I am stumped as to what could be going wrong.
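Reading the error message, the model being called still starts with the VGG input layer (lambda_input_1), which expects raw images shaped (3, 224, 224), while conv_test_feat holds precomputed conv features shaped (512, 22, 40). That suggests, though nothing in the thread confirms it, that predict is being called on the full VGG model rather than on the model trained on conv features (bn_model in the code above). A small pre-flight check like the hypothetical helper below can catch such mismatches early; in Keras the expected shape is available as model.input_shape.

```python
def shapes_compatible(expected, actual):
    """Compare an expected input shape against an array shape.

    None in `expected` is a wildcard, as in Keras shapes like
    (None, 3, 224, 224) where the batch dimension is unspecified.
    """
    if len(expected) != len(actual):
        return False
    return all(e is None or e == a for e, a in zip(expected, actual))


# Shapes taken from the error message above.
print(shapes_compatible((None, 3, 224, 224), (500, 512, 22, 40)))
print(shapes_compatible((None, 512, 22, 40), (500, 512, 22, 40)))
```

The first call reports a mismatch and the second a match, so a model whose input shape is (None, 512, 22, 40) would be the one to call predict on with these features.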