CNN training on multi-resolution dataset

Dear All,

I am trying to learn some computer vision by implementing a solution to the following Kaggle competition using Fast.AI:


This is a medical/biological object detection problem. For now I'm trying to build a simple baseline: a ResNet-34 based U-Net segmentation model (at 256x256) plus post-processing. I am getting an intersection-over-union of 0.83 for the segmentation (without post-processing), which is not satisfactory yet :smiley: I've been training for around 60 epochs, starting at 128x128 and then continuing at full size.
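
In case it helps to see the setup, here is roughly how I build the learner and do the two-stage (128 → 256) training. This is a simplified sketch: the dataset path, label function, class codes, batch sizes and epoch counts below are placeholders, not my actual pipeline.

```python
from fastai.vision.all import *

path = Path('data')                    # placeholder dataset root
codes = ['background', 'object']       # placeholder class names

def label_func(fn):
    # placeholder: assumes masks live in a parallel folder with a '_mask' suffix
    return path/'masks'/f'{fn.stem}_mask{fn.suffix}'

def get_dls(size, bs):
    # build segmentation dataloaders that resize every item to `size`
    return SegmentationDataLoaders.from_label_func(
        path, get_image_files(path/'images'), label_func,
        codes=codes, bs=bs, item_tfms=Resize(size))

# stage 1: train the ResNet-34 U-Net at 128x128
learn = unet_learner(get_dls(128, 16), resnet34, metrics=JaccardCoeff())
learn.fine_tune(30)

# stage 2: swap in 256x256 dataloaders and keep training at full size
learn.dls = get_dls(256, 8)
learn.fine_tune(30)
```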

Unfortunately, my data is not consistent resolution-wise, and I am getting those funny holes and little dots in my predictions, shown below (ground-truth mask, prediction, and input image, respectively). On most 256x256 images (the standard size in the dataset) with small objects, I get much closer to a fully correct solution without such artefacts, as shown below. Am I right that the mixed sizes and resolutions cause this problem? It looks as if some local signals are propagated too far through the network. How can I deal with this?
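
For the post-processing step, the kind of clean-up I have in mind is a simple morphological pass over the predicted mask: fill the holes inside objects and drop the tiny isolated dots. A minimal sketch below, assuming the model outputs a per-pixel probability map; the threshold and minimum object size are guesses I would still have to tune on validation data.

```python
import numpy as np
from scipy.ndimage import binary_fill_holes
from skimage.morphology import remove_small_objects

def clean_mask(pred, prob_thresh=0.5, min_size=64):
    """Fill holes and remove small speckles from a predicted probability map.

    prob_thresh and min_size are placeholder values, not tuned.
    """
    mask = pred > prob_thresh                              # binarise prediction
    mask = binary_fill_holes(mask)                         # close holes inside objects
    mask = remove_small_objects(mask, min_size=min_size)   # drop isolated dots
    return mask.astype(np.uint8)
```

I realise this only hides the symptom rather than fixing whatever the network is doing, which is why I'm asking about the resolution issue itself.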

Thanks a lot!