Pre-computing the output of the last convolutional layer

(David Kagan) #1

I worked through lesson 4's statefarm notebook and ran into an issue with pre-computing the output of the last convolutional layer. I'm trying to understand whether it's something I did wrong or whether there is a mistake in the notebook.
In the notebook, the batches used with conv_model have shuffle=True (the default), and when I used conv_feat (the output of conv_model) as input to bn_model I got training and validation accuracy of ~10%.

I tried playing with the learning rate and a few other things that didn't help, which makes me think the labels don't correspond correctly.
I changed the batches to shuffle=False, re-ran everything, and got ~60% training accuracy after 1 epoch.

So, is there a mistake in the notebook, and must the batches not be shuffled when pre-computing the output of an intermediate layer? Or did I miss something?
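To see why shuffling breaks things, here is a minimal numpy sketch (names and sizes are illustrative, not from the notebook): features precomputed in a shuffled order no longer line up with labels read in the original order, so a downstream classifier can do no better than chance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the statefarm setup: 100 samples, 10 classes.
n, n_classes = 100, 10
labels = rng.integers(0, n_classes, size=n)

# Pretend the conv model's output for sample i perfectly encodes its class.
conv_output = np.eye(n_classes)[labels]

# shuffle=True: features come out of the generator in a shuffled order...
perm = rng.permutation(n)
shuffled_features = conv_output[perm]

# ...but the labels are later read in the original order, so row i of the
# features no longer matches row i of the labels.
acc_shuffled = (shuffled_features.argmax(axis=1) == labels).mean()

# shuffle=False: features and labels stay in the same order.
acc_ordered = (conv_output.argmax(axis=1) == labels).mean()

print(acc_shuffled)  # chance-level, roughly 0.1 for 10 classes
print(acc_ordered)   # 1.0
```

Even with "perfect" features, the misalignment alone drives accuracy down to ~1/10, matching the ~10% observed above.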

(Matthijs Jansen) #2

Hey @Kagan,

When precomputing outputs you should not shuffle the input to your conv layers, because that would break the alignment between your labels and the conv output.
The ~10% you see is essentially random guessing (I believe there are 10 classes?).

A similar thing goes for data augmentation: you can only use it if you save the corresponding labels in the correct order, so that each conv output corresponds to the correct label.
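A small sketch of the augmentation case (the array names here are made up for illustration): if you precompute conv features over several augmented passes through the unshuffled data, the matching label array is just the original labels repeated once per pass, in the same order.

```python
import numpy as np

# Labels in the fixed (unshuffled) order the generator walks the data.
labels = np.array([0, 1, 2, 1])
n_aug = 3  # number of augmented passes over the dataset

# Each augmented pass visits the samples in the same order, so the label
# array for the stacked features is simply the labels tiled per pass.
aug_labels = np.tile(labels, n_aug)

# The matching features would be the per-pass conv outputs concatenated
# in the same pass order, e.g. np.concatenate([feat_pass0, feat_pass1, ...]).
print(aug_labels)  # [0 1 2 1 0 1 2 1 0 1 2 1]
```

Row i of the stacked features then always corresponds to aug_labels[i], which is exactly the alignment the batchnorm model needs.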

I guess you did not miss a thing!
Please correct me if I did though…


(David Kagan) #3

Thanks for your reply @MPJ.

It would probably be a good idea for someone who can edit the statefarm notebook to make this adjustment:
batches = get_batches(path+'train', batch_size=batch_size, shuffle=False)

in a cell after:
In [15]: conv_model = Sequential(conv_layers)

(Matthijs Jansen) #4

Hi @Kagan

No problem! You could open a pull request against the GitHub repository to get this fixed.