Fancy U-Nets

I’ve been working on some segmentation projects. In trying to maximize performance, I’ve looked into some different U-Net architectures. Here are two I found interesting.

https://openreview.net/pdf?id=SyQtAooiz
This paper replaces pooling and upsampling with pixel-shuffle-like operations and uses a CycleGAN-like framework for training the model. I tried training a model with a pretrained ResNet backbone followed by pixel-shuffle upsampling. It performed approximately the same as the transpose-convolution model used in lesson 14. At some point I want to try training a full model from scratch that downsamples with a pixel-shuffle-type operation rather than stride-2 convolutions, but I haven’t gotten around to that yet.
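
For reference, here’s a minimal sketch of the kind of pixel-shuffle upsampling block I mean (my own illustration, not code from the paper): a conv expands the channel count by `scale**2`, then `nn.PixelShuffle` rearranges those channels into a higher-resolution grid.

```python
import torch
import torch.nn as nn

class PixelShuffleUpsample(nn.Module):
    """2x upsampling via conv + PixelShuffle, as an alternative
    to a transpose convolution in a U-Net decoder."""
    def __init__(self, in_channels, out_channels, scale=2):
        super().__init__()
        # Expand channels by scale**2 so PixelShuffle can rearrange
        # them into a (scale*H, scale*W) spatial grid.
        self.conv = nn.Conv2d(in_channels, out_channels * scale ** 2,
                              kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.conv(x))

x = torch.randn(1, 256, 16, 16)
up = PixelShuffleUpsample(256, 128)
print(up(x).shape)  # torch.Size([1, 128, 32, 32])
```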


This paper replaces pooling and upsampling with wavelet wizardry that I’m still trying to wrap my head around.
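
In case it helps anyone else digging into it: the core idea is a discrete wavelet transform in place of pooling. Here’s a rough sketch of a single-level 2D Haar transform (my own simplification; the paper’s exact sub-band handling differs):

```python
import torch

def haar_downsample(x):
    """Single-level 2D Haar transform on a (N, C, H, W) tensor.
    Produces a low-pass sub-band (ll) and three detail sub-bands
    (lh, hl, hh), each at half resolution. Wavelet pooling keeps
    (some of) these sub-bands instead of max pooling."""
    x00 = x[..., 0::2, 0::2]  # even rows, even cols
    x01 = x[..., 0::2, 1::2]  # even rows, odd cols
    x10 = x[..., 1::2, 0::2]  # odd rows, even cols
    x11 = x[..., 1::2, 1::2]  # odd rows, odd cols
    ll = (x00 + x01 + x10 + x11) / 2  # local average (low-pass)
    lh = (x00 + x01 - x10 - x11) / 2  # vertical detail
    hl = (x00 - x01 + x10 - x11) / 2  # horizontal detail
    hh = (x00 - x01 - x10 + x11) / 2  # diagonal detail
    return ll, lh, hl, hh

x = torch.randn(1, 3, 32, 32)
ll, lh, hl, hh = haar_downsample(x)
print(ll.shape)  # torch.Size([1, 3, 16, 16])
```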


Thanks for sharing. I’ll try applying these to my problem and report back.


Can you share your implementation?

@Karl, you can also take a look at this GitHub repo:

With this code, you can easily instantiate different types of encoder-decoder segmentation architectures, with or without ImageNet-pretrained weights and with different CNN backbone encoders. Very useful in many situations for finding an optimal solution.
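
For anyone curious what that looks like in practice, the pattern is usually a factory that picks a backbone by name and optionally loads pretrained weights. A toy sketch using torchvision backbones (`build_encoder` is a hypothetical helper for illustration, not the repo’s actual API):

```python
import torch
import torch.nn as nn
import torchvision.models as models

def build_encoder(backbone="resnet34", pretrained=True):
    """Hypothetical helper: look up a torchvision backbone by name,
    optionally with pretrained weights, and strip its classifier
    so it can serve as a segmentation encoder."""
    encoder = getattr(models, backbone)(pretrained=pretrained)
    # Drop avgpool and fc; keep the convolutional feature extractor.
    return nn.Sequential(*list(encoder.children())[:-2])

enc = build_encoder("resnet34", pretrained=False)
feats = enc(torch.randn(1, 3, 224, 224))
print(feats.shape)  # torch.Size([1, 512, 7, 7])
```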


The pixel-shuffle-like downsampling operation is actually discussed in this PyTorch GitHub issue: https://github.com/pytorch/pytorch/issues/2456

They call it “pixel unshuffle”.
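
For anyone who wants to play with it: recent PyTorch versions ship this as `F.pixel_unshuffle` (1.8+, if I remember right), and the manual reshape below shows what the op actually does:

```python
import torch
import torch.nn.functional as F

def pixel_unshuffle(x, factor=2):
    """Inverse of pixel shuffle: fold each (factor x factor) spatial
    block into the channel dimension, halving H and W for factor=2.
    (N, C, H, W) -> (N, C * factor**2, H // factor, W // factor)."""
    n, c, h, w = x.shape
    x = x.view(n, c, h // factor, factor, w // factor, factor)
    x = x.permute(0, 1, 3, 5, 2, 4).contiguous()
    return x.view(n, c * factor ** 2, h // factor, w // factor)

x = torch.randn(1, 3, 32, 32)
out = pixel_unshuffle(x)
print(out.shape)                                     # torch.Size([1, 12, 16, 16])
print(torch.allclose(out, F.pixel_unshuffle(x, 2)))  # True (PyTorch 1.8+)
```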