Lesson 7 discussion



In lesson7.ipynb in the Pseudo-labeling section, the first cell has the predictions call:

preds = model.predict([conv_test_feat, test_sizes], batch_size=batch_size*2)

but the test_sizes variable is not defined in the notebook. Can anyone help with this?

Thank you!

(Sean Lanning) #15

To clarify: it seems that pseudo-labeling was tried with the model that uses additional “meta-data” about the image size. That model has multiple input layers; specifically, sz_inp is the input for size. If you want to use this model for pseudo-labeling, you have to run the code that builds it and also add this code:

raw_test_sizes = [PIL.Image.open(path+'test/'+f).size for f in test_filenames]
test_sizes = to_categorical([size2id[o] for o in raw_test_sizes], len(id2size))
test_sizes = test_sizes-trn_sizes_orig.mean(axis=0)/trn_sizes_orig.std(axis=0)
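
A note on the last line above: Python evaluates division before subtraction, so as written it computes test_sizes - (mean / std) rather than the more usual (test_sizes - mean) / std. The sketch below uses made-up arrays (not data from the notebook) to show that the two differ. Whichever form was applied to the training sizes when the model was trained is presumably the one to apply to the test sizes as well.

```python
import numpy as np

# Made-up stand-ins for one-hot size features; not data from the notebook.
trn = np.array([[1., 0.], [0., 1.], [1., 0.], [1., 0.]])
tst = np.array([[1., 0.], [0., 1.]])

mean = trn.mean(axis=0)
std = trn.std(axis=0)

# As written in the post: division binds tighter than subtraction.
as_written = tst - mean / std

# Conventional standardization, with explicit parentheses.
standardized = (tst - mean) / std

print(as_written)
print(standardized)
```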

More likely, your model is not using the metadata, in which case it should be as simple as changing the call to this:

preds = model.predict(conv_test_feat, batch_size=batch_size*2)


Thanks very much!

(Chintan) #17

In lesson7.ipynb, the Pseudo-labeling section has this line:

test_batches = gen.flow(conv_test_feat, preds, batch_size=16)

but gen.flow only allows 1, 3, or 4 channels on axis 1. How do we use features precomputed from the VGG conv layers here? I am getting this error:

ValueError: NumpyArrayIterator is set to use the dimension ordering convention "th" (channels on axis 1), i.e. expected either 1, 3 or 4 channels on axis 1. However, it was passed an array with shape (19662L, 512L, 14L, 14L) (512 channels).
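
One possible workaround, sketched here as an assumption rather than what the notebook does: since precomputed features need no image augmentation, the batches could come from a plain Python generator instead of ImageDataGenerator.flow, which sidesteps the channel check. array_batches is a hypothetical helper name, and the arrays below are toy stand-ins.

```python
import numpy as np

def array_batches(x, y, batch_size, shuffle=True, seed=0):
    """Yield (x_batch, y_batch) tuples forever, for arrays of any shape."""
    rng = np.random.RandomState(seed)
    n = len(x)
    while True:
        # Reshuffle the sample order at the start of every pass.
        idx = rng.permutation(n) if shuffle else np.arange(n)
        for start in range(0, n, batch_size):
            sel = idx[start:start + batch_size]
            yield x[sel], y[sel]

# Toy stand-ins for conv features (512 channels) and one-hot labels.
feats = np.zeros((10, 512, 14, 14), dtype=np.float32)
labels = np.eye(8, dtype=np.float32)[np.arange(10) % 8]

gen = array_batches(feats, labels, batch_size=4)
xb, yb = next(gen)
print(xb.shape, yb.shape)
```

Keras's fit_generator accepts any generator that yields (inputs, targets) tuples. Whether such a generator can be dropped into MixIterator depends on which attributes that class expects from its iterators.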

(Gidi Shperber) #18


I’m going through the notebook and trying to make some submissions. However, my Kaggle leaderboard loss is very different from my validation log loss. For example, I’m now at the bounding-boxes section, and I get a validation loss of 0.12 and an accuracy of 0.98. When I submitted to Kaggle, I got a loss of 1.12. I made a spreadsheet in which I tried to recover my real accuracy from the LB log loss, and I got (with a 0.82 threshold) that my leaderboard accuracy is around 0.73.
Does my calculation seem correct? If so, why is there such a difference?
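
For reference, here is a small sketch of how multiclass log loss behaves, using invented numbers. It illustrates that a modest share of confidently wrong predictions dominates the average, so a loss well above 1 does not by itself pin down the accuracy. The 8-class setup and the 0.98 confidence are assumptions for illustration only.

```python
import numpy as np

def log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    probs = np.clip(probs, eps, 1 - eps)
    return -np.mean(np.log(probs[np.arange(len(y_true)), y_true]))

n_classes, n = 8, 100
y_true = np.zeros(n, dtype=int)  # every sample truly belongs to class 0

# A model that always puts 0.98 on one class and spreads the rest evenly.
conf = 0.98
rest = (1 - conf) / (n_classes - 1)

def predictions(n_wrong):
    """First n_wrong rows confidently pick class 1; the others pick class 0."""
    p = np.full((n, n_classes), rest)
    p[:, 0] = conf
    p[:n_wrong, 0] = rest
    p[:n_wrong, 1] = conf
    return p

for n_wrong in (2, 27):
    accuracy = 1 - n_wrong / float(n)
    print(accuracy, log_loss(y_true, predictions(n_wrong)))
```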

(Angel) #19

@shgidi this from the kaggle discussion forum gives you some hints as to why:

also look at:

(Gidi Shperber) #20

Thanks! My partner pointed this out as well… However, I don’t fully understand this, because it seems the test set is similar to the training set. Anyway, I think I’ll try to split the ships between validation and training…

(Bob) #21

I get the error:

ValueError: Error when checking : expected lambda_input_1 to have shape (None, 3, 224, 224) but got array with shape (500, 512, 22, 40)

In the last line of the pseudo labeling section:

preds = model.predict(conv_test_feat, batch_size=batch_size*2)
gen = image.ImageDataGenerator()
test_batches = gen.flow(conv_test_feat, preds, batch_size=16)
val_batches = gen.flow(conv_val_feat, val_labels, batch_size=4)
batches = gen.flow(conv_feat, trn_labels, batch_size=44)
mi = MixIterator([batches, test_batches, val_batches])
bn_model.fit_generator(mi, mi.N, nb_epoch=8, validation_data=(conv_val_feat, val_labels))

I had to modify the first line from this:
preds = model.predict([conv_test_feat, test_sizes], batch_size=batch_size*2)

to this: (as suggested previously in the thread)
preds = model.predict(conv_test_feat, batch_size=batch_size*2)

I am stumped as to what could be going wrong.

(sravya8) #22

I put together notes on lesson 7 with a bunch of questions inline :slight_smile: It would be great if you could edit them if you understand those aspects better: http://wiki.fast.ai/index.php/Lesson-7-notes

(Karthik Kannan) #23

Hey Sravya,

The former is what I was facing initially too. My performance with Resnet even after adding shuffle=False remains worse than VGG + batch norm.

   batches = get_batches(path+'train',shuffle=False,batch_size=64)
   val_batches = get_batches(path+'valid',shuffle=False,batch_size=64)
   test_batches = get_batches(test_path,shuffle=False,batch_size=64)

Any pointers to what might be going wrong? I’m clipping my predictions at 0.05 and 0.975.
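
For readers unfamiliar with the clipping mentioned above, here is a minimal sketch with invented predictions. np.clip bounds each probability so that a confidently wrong answer cannot incur an unbounded log loss. The bounds 0.05 and 0.975 are the ones from the post; the array is made up.

```python
import numpy as np

# Invented predictions for 2 samples over 4 classes.
preds = np.array([[0.999, 0.001, 0.0, 0.0],
                  [0.30, 0.60, 0.08, 0.02]])

clipped = np.clip(preds, 0.05, 0.975)

# The worst-case per-sample log loss is now bounded by -log(0.05).
print(clipped)
print(-np.log(clipped.min()))
```

Note that rows no longer sum to 1 after clipping.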

(Jeremy Howard) #24

For those of you using resnet, which layers are you making trainable? You’ll need to be careful to choose a set of layers that makes sense given the architecture of resnet!

(Jeremy Howard) #25

@sravya8 note that @bckenstler has finished the lesson 7 notes now: http://wiki.fast.ai/index.php/Lesson_7_Notes

Also, here’s the data leakage paper - http://www.cs.umb.edu/~ding/history/470_670_fall_2011/papers/cs670_Tran_PreferredPaper_LeakingInDataMining.pdf . (Brad perhaps you could link to that from the notes?)

(Norman Secord) #26

I’ve been looking at Lesson 7 and, in particular, generating the output from the convolution layers and I am confused by the fact that you have many different files for the weights.

If I look at vgg16.py, when the model is built we load the weights ‘vgg16.h5’, while in vgg16bn.py, if there is no top we load ‘vgg16_bn_conv.h5’ and with the top we load ‘vgg16_bn.h5’. I can understand that ‘vgg16_bn.h5’ should be different from ‘vgg16.h5’ for the dense layers due to the addition of batch normalization, but since the convolution layers don’t change, I would expect ‘vgg16_bn_conv.h5’ to be identical to ‘vgg16.h5’ without the top.

Also in lesson 7, to get the output of the convolution layers, you are using the VGG16BN model, but before splitting off the convolution layers, the model is trained for 3 epochs and the weights are saved. These weights are reloaded just before the split. Given that all of the layers except the decision layer are frozen during fine-tuning, that seems to me to be a step that would have no effect. Can you explain why you reloaded the weights?

Basically, my question is: if you are only interested in getting the output of the convolution layers, it should not matter whether you use the original VGG16 model or the VGG16BN model, should it? Are they not identical up until we get to the fully connected layers?

(Jeremy Howard) #27

Not quite - the BN version was fine-tuned so the conv layers will have different scaling to take advantage of the BN.

(Karthik Kannan) #28

Hey Chintan,

I’m facing the same issue. Did you have a chance to fix it?


(Christina Young) #29

Has anyone been experiencing the same issue that I have – in that I need to use Keras flow_from_directory() instead of flow() and fit_generator() instead of fit() – because I am using my local machine (old win7 machine with 6GB memory) and can’t load all the fisheries images into memory?

I would really like to pass numpy arrays held in memory (the arrays for the metadata inputs, the bounding boxes, and the test predictions) alongside the directory data when using flow_from_directory() and fit_generator(). Not being able to do so is affecting my ability to do multi-input, multi-output, and pseudo-labeling with MixIterator (the last one because the DirectoryIterator object generated by flow_from_directory() does not support indexing). You cannot specify your own ‘y’ with flow_from_directory() like you can with flow() for the pseudo-labeling.

Am I missing something? Is it possible to get those 2 methods to work with arrays in memory as additional inputs to the images in the directories? Do I need to write a lot of new code to make it work? Or should I just forget it and add more memory to my machine? :wink:

Thanks, Christina
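
One way this could be approached, sketched under assumptions: wrap the directory iterator in a generator that slices the in-memory arrays in the same order as the files. This only lines up if the iterator is created with shuffle=False (and yields image batches only, e.g. with class_mode=None). with_extra_arrays is a hypothetical helper, and a plain list of arrays stands in for the real iterator so the sketch runs without Keras.

```python
import numpy as np

def with_extra_arrays(image_batches, extra_inputs, targets, n):
    """Pair each image batch with matching slices of in-memory arrays.

    image_batches: iterator yielding image arrays in a fixed, unshuffled
                   order, cycling through n samples
    extra_inputs:  per-sample metadata array of length n
    targets:       per-sample label array of length n
    """
    pos = 0
    for imgs in image_batches:
        # Indices of the samples in this batch, wrapping at the epoch end.
        idx = (pos + np.arange(len(imgs))) % n
        yield [imgs, extra_inputs[idx]], targets[idx]
        pos = (pos + len(imgs)) % n

# Stand-in for an unshuffled directory iterator: 5 "images", batches of 2.
n = 5
fake_images = np.arange(n, dtype=np.float32).reshape(n, 1)
image_batches = iter([fake_images[0:2], fake_images[2:4],
                      fake_images[4:5], fake_images[0:2]])

sizes = np.arange(n) * 10     # toy metadata
labels = np.arange(n) * 100   # toy targets

for (imgs, meta), y in with_extra_arrays(image_batches, sizes, labels, n):
    print(imgs.ravel(), meta, y)
```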

(Ravi Teja Gutta) #30

Hi all, can someone explain to me why the image size inputs in notebook 7 were fed to a BatchNorm layer if they are already normalized?

(Howon Song) #31

@jeremy your lessons are AMAZING. I couldn’t find a way to start a new thread, so please allow me to ask you a question here: what do I do if I want to classify something other than an image?

What really got me excited about Deep Learning from your first lesson was that you said “it’s completely false when people say deep learning requires a large data set.” I am starting to wonder if that’s only true because we have pre-trained models based on large datasets like ImageNet, and we can achieve great results by fine tuning it, if the classification problem falls under a similar category of the pre-trained model.

Let’s say I want to use deep learning to implement a speech recognition model for a very limited set of situations. From your experience, do you think I will need at least thousands of audio recordings to be able to come up with a viable model? More generally, how do I go about building a model from scratch when I can’t rely on a pre-trained model?

(Romano) #32

I am having exactly the same problem, and I had to resort to a different procedure altogether without using the MixIterator.

In the notebook, it looks like the MixIterator worked for Jeremy and he was able to fetch the precomputed batches even though they are not technically images with 3 channels. Can anyone shed any light on that?


Can you explain your method or show the code? I am stuck here, and with the much bigger test-stg2 dataset, pseudo-labeling would help a lot.
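
As a rough idea of what mixing labeled and pseudo-labeled batches could look like without ImageDataGenerator, here is a sketch that draws one batch from each of several generators and concatenates them. It is an illustration under assumptions, not the notebook's MixIterator and not the method of the poster above; mix_batches and cycle_batches are hypothetical names, and the arrays are toy data.

```python
import numpy as np

def cycle_batches(x, y, batch_size):
    """Yield consecutive (x, y) batches from in-memory arrays, forever."""
    n = len(x)
    while True:
        for start in range(0, n, batch_size):
            yield x[start:start + batch_size], y[start:start + batch_size]

def mix_batches(generators):
    """Concatenate one batch from each generator into a single batch."""
    while True:
        parts = [next(g) for g in generators]
        xs = np.concatenate([p[0] for p in parts])
        ys = np.concatenate([p[1] for p in parts])
        yield xs, ys

# Toy features: 8 training rows with labels, 4 test rows with pseudo-labels.
trn_x, trn_y = np.ones((8, 3)), np.zeros((8, 2))
test_x, pseudo_y = np.zeros((4, 3)), np.full((4, 2), 0.5)

mixed = mix_batches([cycle_batches(trn_x, trn_y, 4),
                     cycle_batches(test_x, pseudo_y, 2)])
xb, yb = next(mixed)
print(xb.shape, yb.shape)
```

The per-generator batch sizes set the ratio of real to pseudo-labeled samples in each mixed batch (4 to 2 in this toy case).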