Autoencoder: upscaling encoded output to arbitrary size

bushindo · June 9, 2021, 4:57am

Hi there,

I’m practicing building an autoencoder. My encoder uses xresnet: it takes the input (a 425x425 image) and returns a 512-dimensional vector.

The problem is that I need to upscale this 512-dimensional vector back to a 425x425 array. I saw some examples that use PixelShuffle_ICNR layers to upscale a given tensor, but the layer can only upscale by an integer factor (scale = 2, scale = 3, etc.).

425 has prime factors 5, 5, and 17, so I could upscale using those factors. Is there an easier way to upscale to an arbitrary size in fastai?
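The factorization mentioned above is easy to check with a few lines of plain Python. This is only an illustrative sketch: `prime_factors` is a hypothetical helper written for this example, not part of fastai or PyTorch.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order (trial division)."""
    factors = []
    d = 2
    while d * d <= n:
        # Divide out every copy of d before moving on to the next candidate.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever is left after trial division is itself prime.
        factors.append(n)
    return factors


print(prime_factors(425))  # [5, 5, 17]
```

A prime image size such as 431 has only itself as a factor, so chaining integer-scale layers would need one very large scale step in that case.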

darek.kleczek · June 9, 2021, 9:22am

Hi, can you say more about what you’re trying to achieve? A 512-dim vector is orthogonal to the picture dimensions (425x425), so upscaling it doesn’t seem straightforward to me…

bushindo · June 9, 2021, 4:32pm

Sorry I wasn’t clear. The situation is as follows.

Let’s say my batch size is 256. My input batch (black-and-white images) has shape 256x1x425x425.

After I pass the input into the encoder, the encoded output is 256x512x1x1

I need to build a decoder that maps the 256x512x1x1 encoded data back to 256x1x425x425.

If I call PixelShuffle_ICNR with scale=2 plus some ConvLayer, I can change the encoded output 256x512x1x1 to 256x256x2x2, and calling it again will produce 256x128x4x4.
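The shape bookkeeping described above can be traced in plain Python. This is a sketch under an assumption: every stage halves the channel count and doubles the spatial size, as in the two steps just described. `doubling_schedule` is a hypothetical helper and does not call fastai.

```python
def doubling_schedule(channels, size, target):
    """List (channels, height, width) after each assumed scale-2 stage.

    Stops once the spatial size reaches the target or the channels run out.
    """
    shapes = [(channels, size, size)]
    while size < target and channels > 1:
        channels //= 2   # assumed: each stage halves the channels
        size *= 2        # assumed: each stage doubles height and width
        shapes.append((channels, size, size))
    return shapes


for shape in doubling_schedule(512, 1, 425):
    print(shape)
```

Under that assumption the sizes go 1, 2, 4, ..., 256, 512, so repeated doubling steps over 425 and lands on 512x512 rather than the required 425x425.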

My question is: since PixelShuffle_ICNR can only upscale by integer factors, how do I upscale to an arbitrary output size? In this case it’s 256x1x425x425, but I’d also like to know about the general case and difficult cases, such as an image size that is a prime number.
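One possible approach, offered as a sketch and not as something confirmed in this thread, is to upscale by powers of two for a few stages and then resize to the exact target with interpolation. The example below uses plain PyTorch (`nn.PixelShuffle` and `F.interpolate`) in place of fastai’s PixelShuffle_ICNR. The class name `SketchDecoder`, the number of stages, and the layer choices are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SketchDecoder(nn.Module):
    """Hypothetical decoder: (N, 512, 1, 1) -> (N, 1, H, W) for any H, W."""

    def __init__(self, in_channels=512, out_size=(425, 425), n_stages=6):
        super().__init__()
        self.out_size = out_size
        stages = []
        ch = in_channels
        for _ in range(n_stages):
            # The conv produces 2*ch channels; PixelShuffle(2) rearranges them
            # into ch/2 channels at twice the spatial resolution.
            stages += [
                nn.Conv2d(ch, ch * 2, kernel_size=3, padding=1),
                nn.PixelShuffle(2),
                nn.ReLU(),
            ]
            ch //= 2
        self.stages = nn.Sequential(*stages)
        # Final conv maps the remaining channels to one grayscale channel.
        self.head = nn.Conv2d(ch, 1, kernel_size=3, padding=1)

    def forward(self, x):
        x = self.stages(x)  # (N, 8, 64, 64) with the default arguments
        # Interpolation accepts an explicit output size, so it is not
        # restricted to integer scale factors.
        x = F.interpolate(x, size=self.out_size, mode="bilinear", align_corners=False)
        return self.head(x)


decoder = SketchDecoder()
encoded = torch.randn(2, 512, 1, 1)   # small batch, just to check shapes
decoded = decoder(encoded)
print(decoded.shape)  # torch.Size([2, 1, 425, 425])
```

Because the last resize takes an explicit size, the same sketch works for a prime target such as `out_size=(431, 431)`. Whether a final interpolation gives good reconstructions compared with a purely learned upsampling path is a separate question that this example does not address.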