HRNet: new SOTA architecture


I just started experimenting with HRNet, and my initial results look quite promising. Compared to ResNet, I have to use smaller learning rates, but it still converges as fast as ResNet.
The core idea behind HRNet is not to work at a single resolution as ResNet does, but to maintain multiple resolutions, compute on them in parallel, and share information (fusion) across the different resolutions.
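To make the parallel-branches-plus-fusion idea concrete, here is a minimal PyTorch sketch of a single exchange step between two resolutions. It is not the official HRNet code: the class name `TwoBranchFusion`, the channel counts, and the specific choice of a strided 3x3 convolution for downsampling and a 1x1 convolution plus bilinear upsampling for the other direction are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoBranchFusion(nn.Module):
    """Toy exchange unit (hypothetical): two resolutions computed in parallel, then fused."""

    def __init__(self, c_high=16, c_low=32):
        super().__init__()
        # Each branch first computes on its own resolution.
        self.high = nn.Conv2d(c_high, c_high, 3, padding=1)
        self.low = nn.Conv2d(c_low, c_low, 3, padding=1)
        # high -> low: a strided conv halves the spatial size and matches channels.
        self.down = nn.Conv2d(c_high, c_low, 3, stride=2, padding=1)
        # low -> high: a 1x1 conv matches channels; upsampling happens in forward().
        self.up = nn.Conv2d(c_low, c_high, 1)

    def forward(self, x_high, x_low):
        h = F.relu(self.high(x_high))
        l = F.relu(self.low(x_low))
        # Bring the low-resolution features up to the high-resolution grid.
        l_up = F.interpolate(self.up(l), size=h.shape[-2:],
                             mode="bilinear", align_corners=False)
        # Fusion: each branch receives the sum of its own and the other's features.
        return h + l_up, l + self.down(h)


x_high = torch.randn(1, 16, 64, 64)  # high-resolution branch
x_low = torch.randn(1, 32, 32, 32)   # half resolution, more channels
y_high, y_low = TwoBranchFusion()(x_high, x_low)
print(y_high.shape, y_low.shape)
```

Both outputs keep the shape of their own branch, so units like this can be stacked, and the high-resolution branch is never thrown away.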
The architecture seems powerful enough to be used for both classification and image segmentation; the only difference is in the last convolutions/computations.
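As a rough illustration of how only the final computations need to change between tasks, the sketch below puts two different heads on the same set of multi-resolution features. Everything here is a simplified assumption: `merge_to_high_res`, `SegHead`, `ClsHead`, the channel counts, and the class counts are hypothetical names and values, and the real HRNet heads are more elaborate.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def merge_to_high_res(feats):
    """Upsample every branch to the first (highest) resolution and concatenate channels."""
    size = feats[0].shape[-2:]
    upsampled = [feats[0]] + [
        F.interpolate(f, size=size, mode="bilinear", align_corners=False)
        for f in feats[1:]
    ]
    return torch.cat(upsampled, dim=1)


class SegHead(nn.Module):
    """Segmentation (hypothetical head): a 1x1 conv gives per-pixel class logits."""

    def __init__(self, in_ch, n_classes):
        super().__init__()
        self.classify = nn.Conv2d(in_ch, n_classes, 1)

    def forward(self, feats):
        return self.classify(merge_to_high_res(feats))


class ClsHead(nn.Module):
    """Classification (hypothetical head): global average pooling, then a linear layer."""

    def __init__(self, in_ch, n_classes):
        super().__init__()
        self.fc = nn.Linear(in_ch, n_classes)

    def forward(self, feats):
        pooled = merge_to_high_res(feats).mean(dim=(2, 3))
        return self.fc(pooled)


# Stand-ins for the outputs of a two-branch backbone (batch of 2).
feats = [torch.randn(2, 16, 64, 64), torch.randn(2, 32, 32, 32)]
seg_logits = SegHead(in_ch=48, n_classes=5)(feats)
cls_logits = ClsHead(in_ch=48, n_classes=10)(feats)
print(seg_logits.shape, cls_logits.shape)
```

The backbone features are identical in both cases; the segmentation head keeps the spatial grid, while the classification head pools it away.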


Explanation by one of the authors (starts at around 8:00):

Classification, Object Detection, Semantic Segmentation, …

Pose Estimation



This is awesome. I was actually looking for the code for this today. I am currently training HRNet and ResNet on the ADE20K dataset for semantic segmentation. Thanks for posting the links, I will be reading more about this.