VGG Average Pooling

In vgg16_avg.py, @jeremy takes weights trained with max pooling and replaces the max-pooling layers with average pooling.
I might be missing something, but wouldn't the activations after average pooling have a lower mean (compared to after max pooling) and throw off the rest of the network? Shouldn't we retrain the weights on the whole of ImageNet, as Jeremy did when introducing Batch Norm, or are the weights close enough that it works out?
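The concern can be checked with a small sketch: for non-negative (post-ReLU) activations, the max over a pooling window always dominates the average, so average-pooled feature maps are never larger than max-pooled ones. A minimal stdlib-Python illustration (the `pool` helper and the 4x4 feature map are hypothetical, not taken from vgg16_avg.py):

```python
import random

random.seed(0)

# Simulated post-ReLU activations: a non-negative 4x4 feature map.
fmap = [[random.random() for _ in range(4)] for _ in range(4)]

def pool(fmap, op, size=2):
    """Non-overlapping size x size pooling, reducing each window with `op`."""
    out = []
    for i in range(0, len(fmap), size):
        row = []
        for j in range(0, len(fmap[0]), size):
            window = [fmap[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(op(window))
        out.append(row)
    return out

def mean2d(m):
    flat = [v for row in m for v in row]
    return sum(flat) / len(flat)

avg = lambda w: sum(w) / len(w)

max_out = pool(fmap, max)   # what the pretrained weights expect
avg_out = pool(fmap, avg)   # what vgg16_avg-style pooling produces

# max(window) >= mean(window) for every window, so the average-pooled
# output mean can never exceed the max-pooled one.
print(mean2d(max_out) >= mean2d(avg_out))  # True
```

So the question stands: the downstream layers do see smaller activations, and the thread's answer is that the relative structure is preserved well enough for this use case.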


Ideally, yes - although the relative weights are still OK, so it works well enough for this task.


Thanks, this clears it up!