Pre-computing the output of the last convolutional layer


(David Kagan) #1

I worked through the lesson 4 statefarm notebook and ran into an issue when pre-computing the output of the last convolutional layer. I’m trying to understand whether I did something wrong or whether there is a mistake in the notebook.
In the notebook, the batches fed to conv_model are created with shuffle=True (the default). When I used conv_feat (the output of conv_model) as input to bn_model, I got training and validation accuracy of ~10%.

I tried changing the learning rate and a few other things, but nothing helped, which makes me think the labels no longer correspond to the pre-computed features.
I changed batches to shuffle=False, re-ran everything, and got ~60% training accuracy after 1 epoch.

So, is there a mistake in the notebook, and must the batches not be shuffled when pre-computing the output of an intermediate layer? Or did I miss something?


(Matthijs Jansen) #2

Hey @Kagan,

When precomputing outputs you should not shuffle the input to your conv layers, because that breaks the alignment between your labels and the conv output: the features come out in shuffled order, while the labels are still in the original file order.
The 10% you see is effectively random guessing (as I believe there are 10 classes?)
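
To make the misalignment concrete, here is a minimal NumPy sketch. It does not use the notebook's code: the "conv features" are synthetic clusters, and nearest_centroid_accuracy is a hypothetical helper standing in for the model trained on conv_feat. It only illustrates that permuting the features while the labels stay in file order drops accuracy to roughly chance.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 10, 200, 16

# Synthetic stand-in for conv features: one well-separated cluster per class.
centers = rng.normal(size=(n_classes, dim)) * 5.0
labels = np.repeat(np.arange(n_classes), n_per_class)  # labels in file order
feats = centers[labels] + rng.normal(size=(labels.size, dim))


def nearest_centroid_accuracy(features, labels):
    """Fit one centroid per label and score on the same data (hypothetical helper)."""
    centroids = np.stack(
        [features[labels == c].mean(axis=0) for c in range(n_classes)]
    )
    dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return (dists.argmin(axis=1) == labels).mean()


# Features and labels in the same order: the classes are easy to separate.
aligned_acc = nearest_centroid_accuracy(feats, labels)

# Simulate precomputing with shuffle=True: the features come out in a random
# order, but the labels are still read in file order.
perm = rng.permutation(labels.size)
shuffled_acc = nearest_centroid_accuracy(feats[perm], labels)

print("aligned accuracy: ", aligned_acc)
print("shuffled accuracy:", shuffled_acc)
```

With 10 classes, the shuffled case should land near the 10% chance level, which matches the symptom described above.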

A similar thing goes for data augmentation: you can only precompute with data augmentation if you save the corresponding labels in the order the batches were generated, so that each conv output is paired with its correct label.
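
A rough sketch of that idea in plain NumPy, under stated assumptions: shuffled_augmented_batches is a made-up stand-in for a shuffling, augmenting generator, and fake_conv_features is a placeholder for running the conv model. Neither is the notebook's code. The point is only that the labels are collected from the same batches as the features.

```python
import numpy as np


def shuffled_augmented_batches(images, labels, batch_size, rng):
    """Hypothetical generator: shuffles, randomly flips, yields (x, y) batches."""
    order = rng.permutation(len(images))
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        batch = images[idx].copy()
        flip = rng.random(len(idx)) < 0.5
        batch[flip] = batch[flip][:, :, ::-1]  # horizontal flip as the augmentation
        yield batch, labels[idx]


def fake_conv_features(batch):
    """Placeholder for the conv model's predict step: mean intensity per image row."""
    return batch.mean(axis=2)


rng = np.random.default_rng(0)
labels = np.arange(12) % 3
# Each toy image is filled with its class id, so alignment is easy to check.
images = np.zeros((12, 4, 4)) + labels[:, None, None]

feat_chunks, label_chunks = [], []
for x, y in shuffled_augmented_batches(images, labels, batch_size=5, rng=rng):
    feat_chunks.append(fake_conv_features(x))
    label_chunks.append(y)  # keep the labels that came with this exact batch

conv_feat = np.concatenate(feat_chunks)
conv_labels = np.concatenate(label_chunks)
```

Because the labels are saved batch by batch, conv_labels follows the shuffled order of conv_feat and the two stay paired.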

I guess you did not miss a thing!
Please correct me if I did though…

grts


(David Kagan) #3

Thanks for your reply @MPJ.

It would probably be a good idea for someone who can edit the statefarm notebook to make this adjustment:
add:
batches = get_batches(path+'train', batch_size=batch_size, shuffle=False)

in a cell after:
In [15]: conv_model = Sequential(conv_layers)


(Matthijs Jansen) #4

Hi @Kagan

No problem, you could file a pull request against the GitHub repository to get this fixed!