Just curious whether this is possible, given that we may want to fine-tune some other model in the future that was trained without batch normalization, without the convenience of being able to go back and retrain it on the dataset it was originally built on.
My question is more along the lines of: “What do you do if you want to add batch norm to a model that didn’t use it … and you don’t have access to the original training set to retrain the entire thing with batch norm on?”
I don’t think you’re able to add it afterwards. Jeremy alludes to this in lesson 3.
He does say it might be possible, but that he hasn’t worked it out yet.
Pop some layers, add batch norm, and train the new layers.
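The trick Jeremy hints at is that an inserted batch-norm layer can be initialized to be a no-op on the pretrained activations: set its scale to the activations’ standard deviation and its shift to their mean, so the network’s outputs are unchanged until you start training. Here’s a minimal NumPy sketch of that idea (the function names are my own, not a Keras/fast.ai API):

```python
import numpy as np

def identity_batchnorm_params(acts, eps=1e-5):
    """Pick batch-norm scale (gamma) and shift (beta) so the layer is
    initially a no-op on the given pretrained activations."""
    mu = acts.mean(axis=0)
    var = acts.var(axis=0)
    gamma = np.sqrt(var + eps)  # undoes the division by std
    beta = mu                   # undoes the mean subtraction
    return gamma, beta, mu, var

def batchnorm(acts, gamma, beta, mu, var, eps=1e-5):
    """Standard batch-norm transform with fixed statistics."""
    return gamma * (acts - mu) / np.sqrt(var + eps) + beta
```

With those initial parameters, `batchnorm(acts, ...)` returns `acts` unchanged; gamma and beta then become trainable, so the layer only starts doing real normalization work as you fine-tune.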
I am also interested in this:
Particularly in adding dropout and batchnorm into the convolution layers as there are good reasons from a signal processing perspective as to why this would be advantageous.
I think the vgg_bn model only has batch norm in the dense layers, not in the convolutional layers.
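For what it’s worth, the difference between the two cases is just which axes the statistics are computed over: dense batch norm keeps one mean/variance per unit, while convolutional (“spatial”) batch norm keeps one per channel, pooling over the batch and both spatial axes. A rough NumPy illustration (my own helper names, assuming NCHW layout):

```python
import numpy as np

def bn_dense(acts, eps=1e-5):
    # Dense layer: stats over the batch axis, one mean/var per unit.
    mu = acts.mean(axis=0)
    var = acts.var(axis=0)
    return (acts - mu) / np.sqrt(var + eps)

def bn_conv(acts, eps=1e-5):
    # Conv layer ("spatial" batch norm): stats over batch and spatial
    # axes (N, H, W), one mean/var per channel.
    mu = acts.mean(axis=(0, 2, 3), keepdims=True)
    var = acts.var(axis=(0, 2, 3), keepdims=True)
    return (acts - mu) / np.sqrt(var + eps)
```

Either way each normalized channel/unit ends up with roughly zero mean and unit variance over its pooled axes; the trainable scale and shift (omitted here) are then applied per channel or per unit.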
How much compute would we need to retrain the convolutional component of the VGG network on ImageNet?
If you have to retrain the convolutional component, then you effectively have to train the entire model, which will take about a week. Check the time it took to train the original VGGNet for an idea of the timeframe.