Guidance for writing custom pooling scheme?

I would like to try doing a ‘hybrid’ pooling on my dataset of 2D images, in which we do max pooling along the rows, and average pooling along the columns.

How would you recommend I go about doing this?

It seems that fastai-v2 is still using PyTorch, so maybe I should post this on the PyTorch boards instead…?

Briefly, I see that PyTorch’s 1D pooling operations expect, essentially, 1D data, whereas I’d like to apply 1D max pooling along each row of the image, followed by 1D average pooling along the remaining columns. (Or vice versa. I know these operations won’t commute, but I don’t care about the order right now.)

(The idea of writing a loop in which I successively slice along each row and pass it to MaxPooling1D strikes me as prohibitively slow, even if it were parallelized among multiple workers.)


(The motivation for this is that for my type of images, the classifier should be translation-equivariant horizontally, but not so much vertically. Right now I just do 2D max pooling and it works fine. I’m just curious to try out my idea to see if it works even better… it probably won’t, haha, but one can hope.)

I will probably make a fool of myself since it’s late and I’ve had a glass of wine. But here goes…

If you just want the simple max of each row, this can be done with a max over the column dimension. Then average over the single resulting column.
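
Here is a minimal sketch of that simple case, assuming the images sit in a tensor laid out as (batch, channels, rows, columns). The function name is made up for illustration.

```python
import torch

def row_max_then_col_mean(x):
    """x: (N, C, H, W). Hypothetical helper, not part of PyTorch or fastai."""
    row_max = x.amax(dim=-1)      # max along each row -> (N, C, H)
    return row_max.mean(dim=-1)   # average down the resulting column -> (N, C)

# Tiny example: 1 image, 2 channels, 3 rows, 4 columns
x = torch.arange(24, dtype=torch.float32).reshape(1, 2, 3, 4)
pooled = row_max_then_col_mean(x)
print(pooled)  # tensor([[ 7., 19.]])
```

In channel 0 the row maxima are 3, 7 and 11, which average to 7. Channel 1 works the same way and gives 19.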

If you need the complexity of padding, dilation, stride, etc., I bet you could flatten the rows and channels together, so that each image looks like a 1D signal with rows*channels channels, run maxpool1d over that, and then unflatten (reshape) the result. No loops.
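
A rough sketch of the flatten-and-reshape idea, assuming a (batch, channels, rows, columns) layout. The function name and the `k_w` / `k_h` parameters are invented for this example. The average-pooling step reuses the same trick after swapping the row and column axes.

```python
import torch
import torch.nn.functional as F

def hybrid_pool_1d(x, k_w=2, k_h=2):
    """Max pool along rows, then average pool along columns, using only
    1D pooling ops. x: (N, C, H, W). Hypothetical helper for illustration."""
    n, c, h, w = x.shape

    # Fold the rows into the channel dimension: every row becomes its own "channel".
    y = F.max_pool1d(x.reshape(n, c * h, w), kernel_size=k_w)
    w_out = y.shape[-1]
    y = y.reshape(n, c, h, w_out)

    # Swap rows and columns, then fold the columns into the channel dimension.
    y = y.transpose(2, 3).reshape(n, c * w_out, h)
    y = F.avg_pool1d(y, kernel_size=k_h)
    h_out = y.shape[-1]

    # Unfold and swap back to (N, C, H_out, W_out).
    return y.reshape(n, c, w_out, h_out).transpose(2, 3)

x = torch.randn(2, 3, 8, 10)
out = hybrid_pool_1d(x)
print(out.shape)  # torch.Size([2, 3, 4, 5])
```

Extra arguments such as `stride` or `padding` could be passed through to `F.max_pool1d` and `F.avg_pool1d` in the same way, though this sketch does not try to cover them.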

Please let me know if you can make anything real out of this idea. :exploding_head:


Thanks. Your answer made me realize that I mis-described what I need, so I may go back up and edit the post:
It’s not a single max for each row, but a series of maxes over, say, every two pixels along each row.
I like your idea of not having to worry about full generality of paddings & strides (that could come later). Keeping it limited and simple would work for now! :slight_smile:
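
For the limited "every two pixels" version, one possible sketch (not necessarily what either poster ended up using) relies on PyTorch's 2D pooling layers accepting rectangular kernels: a (1, 2) max pool only looks along rows, and a (2, 1) average pool only looks along columns. The `HybridPool` class name and its `k` argument are made up for this example.

```python
import torch
import torch.nn as nn

class HybridPool(nn.Module):
    """Max pool along rows, then average pool along columns.
    Hypothetical module for illustration; expects input of shape (N, C, H, W)."""
    def __init__(self, k=2):
        super().__init__()
        self.row_max = nn.MaxPool2d(kernel_size=(1, k))   # pools over k adjacent pixels in a row
        self.col_avg = nn.AvgPool2d(kernel_size=(k, 1))   # pools over k adjacent pixels in a column

    def forward(self, x):
        return self.col_avg(self.row_max(x))

# 4x4 image with values 0..15
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)
y = HybridPool(k=2)(x)
print(y)
# tensor([[[[ 3.,  5.],
#           [11., 13.]]]])
```

Because this is an ordinary `nn.Module`, it should be usable wherever an `nn.MaxPool2d` layer currently sits in the model.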