Fancy U-Nets

I’ve been working on some segmentation projects. In trying to maximize performance, I’ve looked into some different U-Net architectures. Here are two I found interesting.
This paper replaces pooling and upsampling with pixel-shuffle-like operations and uses a CycleGAN-like framework for training the model. I tried training a model with a pretrained ResNet backbone followed by upsampling using pixel shuffle. It performed approximately the same as the transposed-conv model used in lesson 14. At some point I want to try training a full model from scratch using a pixel-shuffle-type operation to downsample rather than convolutions with a stride of 2, but I haven't gotten around to that yet.
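For anyone curious what "upsampling using pixel shuffle" looks like in practice, here's a minimal sketch of such a decoder block in PyTorch. The class name, channel sizes, and the choice of a 1x1 conv are my own illustration, not taken from the paper or lesson 14:

```python
import torch
import torch.nn as nn

class PixelShuffleUpsample(nn.Module):
    """Upsample by `scale` by expanding channels with a conv,
    then letting PixelShuffle trade channels for spatial resolution.
    A drop-in alternative to nn.ConvTranspose2d upsampling."""
    def __init__(self, in_ch, out_ch, scale=2):
        super().__init__()
        # Conv produces out_ch * scale^2 channels, which PixelShuffle
        # rearranges into an (H*scale, W*scale) feature map with out_ch channels.
        self.conv = nn.Conv2d(in_ch, out_ch * scale ** 2, kernel_size=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.conv(x))

x = torch.randn(1, 256, 16, 16)
up = PixelShuffleUpsample(256, 128, scale=2)
y = up(x)
print(y.shape)  # torch.Size([1, 128, 32, 32])
```

One common refinement is ICNR initialization of the conv weights to avoid checkerboard artifacts, but the plain version above is enough to see the mechanics.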

This paper replaces pooling and upsampling with wavelet wizardry that I’m still trying to wrap my head around.


Thanks for sharing. I will try to apply these to my problem and report back.


Can you share your implementation?

@Karl, you can also take a look at this GitHub repo:

With this code, you can easily instantiate different types of encoder-decoder segmentation architectures, with or without ImageNet-pretrained weights and with different CNN backbone encoders. Very useful in many situations for finding an optimal solution.


The downsampling operation that is like pixel shuffle is actually posted on the PyTorch forums:

They call it "pixel unshuffle".
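The operation is just the inverse of pixel shuffle: each 2x2 (more generally r x r) spatial block is folded into the channel dimension, so H and W shrink by r while C grows by r². Here's a sketch of one way to write it with `view` and `permute`; the function name is mine, and this is the same layout newer PyTorch versions expose as `nn.PixelUnshuffle`:

```python
import torch
import torch.nn.functional as F

def pixel_unshuffle(x, downscale=2):
    """Fold each downscale x downscale spatial block into channels:
    (B, C, H, W) -> (B, C * downscale^2, H // downscale, W // downscale)."""
    b, c, h, w = x.shape
    assert h % downscale == 0 and w % downscale == 0
    # Split H and W into (blocks, within-block) axes...
    x = x.view(b, c, h // downscale, downscale, w // downscale, downscale)
    # ...then move the within-block axes next to the channel axis.
    x = x.permute(0, 1, 3, 5, 2, 4).contiguous()
    return x.view(b, c * downscale ** 2, h // downscale, w // downscale)

x = torch.randn(1, 3, 8, 8)
y = pixel_unshuffle(x, 2)
print(y.shape)  # torch.Size([1, 12, 4, 4])
```

Because it's the exact inverse of pixel shuffle, applying `F.pixel_shuffle(y, 2)` recovers the original tensor, which makes it a natural strided-conv replacement for the from-scratch experiment mentioned above.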