To clarify, it seems that pseudo-labeling was tried with the model that uses additional “metadata” about the image size. They created a model with multiple input layers; specifically, sz_inp is the input for size. If you want to use this model for pseudo-labeling, then you have to run that code and also add this:
raw_test_sizes = [PIL.Image.open(path+'test/'+f).size for f in test_filenames]
test_sizes = to_categorical([size2id[o] for o in raw_test_sizes], len(id2size))
test_sizes = test_sizes-trn_sizes_orig.mean(axis=0)/trn_sizes_orig.std(axis=0)
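One thing worth checking in the last line is operator precedence: division binds tighter than subtraction, so the expression computes test_sizes - (mean / std) rather than (test_sizes - mean) / std. The sketch below uses small made-up arrays (not the real size features) to show that the two forms differ. Whichever form was applied to the training sizes is the one to apply to the test sizes, so that both inputs are on the same scale.

```python
import numpy as np

# Toy one-hot size features; hypothetical stand-ins for trn_sizes_orig / test_sizes.
trn = np.array([[1., 0.], [0., 1.], [1., 0.], [1., 0.]])
test = np.array([[0., 1.], [1., 0.]])

mean, std = trn.mean(axis=0), trn.std(axis=0)

# Expression as written in the post: evaluated as test - (mean / std).
as_written = test - mean / std

# Conventional standardization: subtract the mean first, then divide.
standardized = (test - mean) / std

print(as_written)
print(standardized)
```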
More likely, your model is not using the metadata, and therefore it should be as simple as changing the code to this:
But gen.flow only allows 1, 3, or 4 channels on axis=1, so how do we use features precomputed from the VGG conv layers here? I am getting this error:
ValueError: NumpyArrayIterator is set to use the dimension ordering convention "th" (channels on axis 1), i.e. expected either 1, 3 or 4 channels on axis 1. However, it was passed an array with shape (19662L, 512L, 14L, 14L) (512 channels).
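One possible workaround, sketched here under the assumption that no image augmentation is needed on the precomputed features, is to skip the image iterator entirely and batch the arrays with a plain Python generator. The name array_batches and the toy shapes are hypothetical and not from the notebook. fit_generator accepts any generator that yields (inputs, targets) tuples, so something along these lines could stand in for gen.flow.

```python
import numpy as np

def array_batches(features, labels, batch_size=64, shuffle=True, seed=None):
    """Yield (features, labels) batches forever; works for any array shape."""
    n = len(features)
    rng = np.random.RandomState(seed)
    while True:
        # Reshuffle once per pass over the data.
        order = rng.permutation(n) if shuffle else np.arange(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            yield features[idx], labels[idx]

# Toy stand-ins for precomputed conv features (512 channels) and one-hot labels.
conv_feat = np.zeros((10, 512, 14, 14), dtype=np.float32)
labels = np.eye(8, dtype=np.float32)[np.arange(10) % 8]

batches = array_batches(conv_feat, labels, batch_size=4, shuffle=False)
xb, yb = next(batches)
print(xb.shape, yb.shape)
```

If the features fit in memory and no mixing with other iterators is needed, calling fit directly on the arrays is simpler still.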
I’m going through the notebook and trying to make some submissions. However, my Kaggle leaderboard loss is very different from my validation log loss. For example, I’m now at the bounding-boxes section, and I get a validation loss of 0.12 and an accuracy of 0.98. When I submitted to Kaggle, I got a loss of 1.12. I made a spreadsheet in which I tried to back out my real accuracy from the leaderboard log loss, and I got (with a 0.82 threshold) a leaderboard accuracy of around 0.73.
Does my calculation seem correct? If so, why is there such a difference?
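For reference, here is one way such an estimate could be derived. It is only a sketch and rests on assumptions not stated in the post: that the 0.82 threshold is the clipping value, that every prediction puts exactly 0.82 on one class and spreads the rest evenly over the others, and that there are 8 classes. Under those assumptions the log loss is a linear function of accuracy and can be inverted directly.

```python
import math

def accuracy_from_logloss(logloss, clip=0.82, n_classes=8):
    """Invert logloss = acc * hit + (1 - acc) * miss for acc.

    Assumes every prediction is `clip` on one class and
    (1 - clip) / (n_classes - 1) on each of the others.
    """
    hit = -math.log(clip)                             # loss when the top class is right
    miss = -math.log((1.0 - clip) / (n_classes - 1))  # loss when it is wrong
    return (miss - logloss) / (miss - hit)

est = accuracy_from_logloss(1.12)
print(round(est, 2))  # roughly 0.73 under these assumptions
```

Real submissions are rarely this uniform, so a figure like this is a rough guide at best.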
Thanks! My partner pointed this out as well… However, I don’t fully understand it, because the test set seems similar to the training set. Anyway, I think I’ll try to split the ships between validation and training…
I put together notes on lesson 7 with a bunch of questions inline. It would be great if you could all edit them if you understand those aspects better: http://wiki.fast.ai/index.php/Lesson-7-notes
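A split by ship could be sketched roughly as follows, assuming each image already has some ship identifier. How to obtain those identifiers is not covered in this thread, so split_by_group, the boats list, and the file names below are all hypothetical. The idea is to hold out whole groups, so that no ship appears in both sets.

```python
import random
from collections import defaultdict

def split_by_group(filenames, group_ids, valid_frac=0.2, seed=0):
    """Hold out whole groups so no group appears in both train and validation."""
    groups = defaultdict(list)
    for f, g in zip(filenames, group_ids):
        groups[g].append(f)
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    # Always hold out at least one group.
    n_valid = max(1, int(round(len(keys) * valid_frac)))
    valid_keys = set(keys[:n_valid])
    train = [f for g in keys if g not in valid_keys for f in groups[g]]
    valid = [f for g in keys if g in valid_keys for f in groups[g]]
    return train, valid

# Hypothetical file names and ship ids, for illustration only.
files = ['img_%d.jpg' % i for i in range(10)]
boats = ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'd']
train_files, valid_files = split_by_group(files, boats, valid_frac=0.25)
print(len(train_files), len(valid_files))
```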
For those of you using resnet, which layers are you making trainable? You’ll need to be careful to choose a set of layers that makes sense given the architecture of resnet!
I’ve been looking at Lesson 7 and, in particular, generating the output from the convolution layers and I am confused by the fact that you have many different files for the weights.
If I look at vgg16.py, when the model is built we load the weights ‘vgg16.h5’, while in vgg16bn.py, if there is no top we load ‘vgg16_bn_conv.h5’ and with the top we load ‘vgg16_bn.h5’. I can understand that ‘vgg16_bn.h5’ should be different from ‘vgg16.h5’ for the dense layers due to the addition of batch normalization, but since the convolution layers don’t change, I would expect ‘vgg16_bn_conv.h5’ to be identical to ‘vgg16.h5’ without the top.
Also in lesson 7, to get the output of the convolution layers, you are using the VGG16BN model but before splitting off the convolution layers, the model is trained for 3 epochs and the weights are saved. These weights are reloaded just before the splitting but given that in the fine tuning process all of the layers are frozen except for the decision layer that would seem to me to be a step that would have no effect. Can you explain why you reloaded the weights?
Basically, my question is this: if you are only interested in getting the output of the convolution layers, it should not matter whether you use the original VGG16 model or the VGG16BN model, should it? Are they not identical up until we get to the fully connected layers?
Has anyone been experiencing the same issue that I have – in that I need to use Keras flow_from_directory() instead of flow() and fit_generator() instead of fit() – because I am using my local machine (old win7 machine with 6GB memory) and can’t load all the fisheries images into memory?
I would really like to pass numpy arrays held in memory (the arrays for the metadata inputs, the bounding boxes, and the test predictions) alongside the image data used by flow_from_directory() and fit_generator(). Not being able to do so is limiting my ability to do multi-input, multi-output, and pseudo-labeling with MixIterator (the last one because the DirectoryIterator object generated by flow_from_directory() does not support indexing). You cannot specify your own ‘y’ with flow_from_directory() for the pseudo-labeling, like you can with flow().
Am I missing something? Is it possible to get those 2 methods to work with arrays in memory as additional inputs to the images in the directories? Do I need to write a lot of new code to make it work? Or should I just forget it and add more memory to my machine?
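One possibility is a thin wrapper generator that pairs each image batch with the matching rows of the in-memory arrays. The sketch below assumes the directory iterator is created with shuffle=False, so that batches arrive in a fixed order and wrap around after the last sample. with_extra_arrays and fake_directory_batches are hypothetical names, and the fake iterator only stands in for a real DirectoryIterator so the example runs without any images.

```python
import numpy as np

def with_extra_arrays(image_batches, n_samples, extra_inputs=(), extra_targets=()):
    """Pair ordered (x, y) image batches with rows of in-memory arrays.

    Assumes image_batches yields batches in a fixed, unshuffled order and
    starts over after n_samples rows.
    """
    pos = 0
    for x, y in image_batches:
        rows = slice(pos, pos + len(x))
        inputs = [x] + [a[rows] for a in extra_inputs]
        targets = [y] + [t[rows] for t in extra_targets]
        yield inputs, targets
        pos = (pos + len(x)) % n_samples

def fake_directory_batches(n, batch_size):
    """Stand-in for an unshuffled directory iterator (last batch may be short)."""
    while True:
        for start in range(0, n, batch_size):
            k = min(batch_size, n - start)
            yield np.zeros((k, 3, 8, 8)), np.zeros((k, 8))

sizes = np.arange(10).reshape(10, 1)  # toy metadata, one row per image
gen = with_extra_arrays(fake_directory_batches(10, 4), 10, extra_inputs=[sizes])
seen = [next(gen)[0][1].ravel().tolist() for _ in range(4)]
print(seen)
```

For pseudo-labeling, the same idea could be used to replace the directory labels with your own targets instead of appending to them, though this only holds as long as the batch order really is fixed.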
@jeremy your lessons are AMAZING. I couldn’t find a way to start a new thread, so please allow me to ask you a question here: what do I do if I want to classify something other than an image?
What really got me excited about Deep Learning from your first lesson was that you said “it’s completely false when people say deep learning requires a large data set.” I am starting to wonder if that’s only true because we have pre-trained models based on large datasets like ImageNet, and we can achieve great results by fine tuning it, if the classification problem falls under a similar category of the pre-trained model.
Let’s say I want to use deep learning to implement a speech recognition model for a very limited set of situations. From your experience, do you think I will need at least thousands of audio recordings to be able to come up with a viable model? More generally, how do I go about building a model from scratch when I can’t rely on a pre-trained model?
I am having exactly the same problem, and I had to resort to a different procedure altogether without using the MixIterator.
In the notebook, it looks like the MixIterator worked for Jeremy and he was able to fetch the precomputed batches even though they are not technically images with 3 channels. Can anyone shed any light on that?