Beyond Global Average Pooling - How to pre-process each channel across locations

wbrucek · October 27, 2017, 4:47pm

Newer vision models, like ResNet, use global average pooling at the end, before their dense layers.

Rather than a single global average pooling for each channel, I’d like to apply a single 7x7 convolution to each channel separately. This convolution can then learn whether average, max, or some other pooling is best.

I know how to do a 7x7 convolution of all channels together (resulting in a huge number of weights). How do I do this with each channel separately?
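
To put rough numbers on "a huge number of weights", here is a small sketch of the two weight counts. The channel count of 2048 is an assumption (it matches the final 7x7 feature map of ResNet-50), the function names are made up for illustration, and biases are ignored.

```python
def full_conv_weights(kernel_size, channels_in, channels_out):
    # Ordinary convolution: every output channel has a filter
    # that spans all input channels.
    return kernel_size * kernel_size * channels_in * channels_out


def per_channel_conv_weights(kernel_size, channels):
    # One kernel_size x kernel_size filter per channel, no mixing
    # between channels.
    return kernel_size * kernel_size * channels


channels = 2048  # assumed channel count, not taken from the thread

full = full_conv_weights(7, channels, channels)
per_channel = per_channel_conv_weights(7, channels)

print("all channels together:", full)
print("each channel separately:", per_channel)
print("ratio:", full // per_channel)
```

Under these assumptions the per-channel version needs fewer weights by a factor equal to the number of channels.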

davecg · October 28, 2017, 12:03am

Depending on what you mean precisely, you might want to look into depthwise separable convolutions.

There’s an implementation in Keras as part of the MobileNet application.

wbrucek · October 28, 2017, 3:41am

Awesome - thanks!

machinethink · October 28, 2017, 12:57pm

Just to clarify: a depthwise separable convolution is the combination of a depthwise convolution followed by a pointwise (or 1x1) convolution. What you’re looking for is just the first part, the depthwise convolution, not the 1x1 convolution that follows it.
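
To make the depthwise part concrete, here is a minimal NumPy sketch, not the Keras implementation. The function name `depthwise_valid_conv` and the 7x7x512 feature map are assumptions chosen for illustration. It applies one 7x7 filter per channel with no padding and checks that setting every weight to 1/49 reproduces global average pooling.

```python
import numpy as np


def depthwise_valid_conv(x, kernels):
    """Depthwise convolution without padding (hypothetical helper).

    x:       feature map of shape (H, W, C)
    kernels: one filter per channel, shape (kh, kw, C)
    returns: array of shape (H - kh + 1, W - kw + 1, C)

    As in most deep learning frameworks, this is really a
    cross-correlation: the kernel is not flipped.
    """
    h, w, c = x.shape
    kh, kw, kc = kernels.shape
    assert kc == c, "need exactly one kernel per channel"
    out = np.empty((h - kh + 1, w - kw + 1, c), dtype=x.dtype)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + kh, j:j + kw, :]
            # Sum over the spatial axes only, so channels never mix.
            out[i, j, :] = np.sum(patch * kernels, axis=(0, 1))
    return out


rng = np.random.default_rng(0)
features = rng.standard_normal((7, 7, 512))  # assumed final feature map

# With all weights equal to 1/49, each channel's filter is an average.
avg_kernels = np.full((7, 7, 512), 1.0 / 49.0)
pooled = depthwise_valid_conv(features, avg_kernels)

gap = features.mean(axis=(0, 1))  # global average pooling
print(pooled.shape)
print(np.allclose(pooled[0, 0], gap))
```

So a learned 7x7 depthwise filter on a 7x7 feature map can represent average pooling exactly, along with any other weighted sum over locations. Max pooling is nonlinear, so a single linear filter like this cannot reproduce it. In Keras, a `DepthwiseConv2D` layer with a 7x7 kernel and `padding='valid'` should play the same role, though where that layer lives depends on the Keras version.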