Difficult to use U-Net segmentation models for large images

I went through the fastai material for U-Net segmentation and it works great for small images (like 200x200 resolution).

However, it seems to break down when you’re segmenting larger images (at least 1000x1000 resolution).

What I’ve found so far is that even with progressive resizing, using the lower resolution model’s weights to initialize the next higher resolution model, each consecutive training stage gets more and more difficult. The only workaround I’ve found is to lower the learning rate: on very large images the loss blows up even at a learning rate of 1e-3, even though the learning rate finder suggests that value.

Even with all these tricks, the higher resolution model converges to a higher loss value, and when using it at inference time it underperforms the lower resolution model.

Has anyone run into this issue, and are there any tricks for segmenting high resolution images?

It may be due to the smaller batch size, which makes training more unstable. You could try using AccumulateScheduler or GradientClipping.
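For readers who want to see what those two suggestions amount to, here is a minimal sketch in plain PyTorch rather than the fastai callbacks themselves. The tiny convolution standing in for a segmentation network, the random data, and the values of accum_steps and max_norm are all placeholders chosen for illustration. Note that it scales a mean-reduced loss by the number of accumulation steps, which is one common way to do it and not necessarily how AccumulateScheduler handles the loss.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for a segmentation network: 3 input channels, 2 classes (hypothetical).
model = nn.Conv2d(3, 2, kernel_size=3, padding=1)
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()  # mean over pixels

accum_steps = 4   # number of small batches per optimizer step (placeholder value)
max_norm = 1.0    # gradient-norm ceiling (placeholder value)

# Eight batches of size 1, mimicking the small batches forced by large images.
batches = [
    (torch.randn(1, 3, 16, 16), torch.randint(0, 2, (1, 16, 16)))
    for _ in range(8)
]

n_updates = 0
opt.zero_grad()
for i, (x, y) in enumerate(batches, start=1):
    # Divide so the accumulated gradient matches the mean over accum_steps batches.
    loss = loss_fn(model(x), y) / accum_steps
    loss.backward()  # gradients add up across iterations until zero_grad()
    if i % accum_steps == 0:
        # Rescale gradients whose total norm exceeds max_norm before stepping.
        nn.utils.clip_grad_norm_(model.parameters(), max_norm)
        opt.step()
        opt.zero_grad()
        n_updates += 1
```

With eight batches and accum_steps of 4, the optimizer steps twice, each time on gradients gathered from four images, so the effective batch size is 4 even though only one image is in memory at a time.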


Thanks. I’ve been looking into AccumulateScheduler, and they mention that it’s necessary to change the loss function as well:


It looks like you get a shape mismatch error with this method. The U-Net default loss_func is “FlattenedLoss of CrossEntropyLoss()”, but I can’t seem to find any mention of this loss in the source code. There is a FlattenedLoss class, but “CrossEntropyFlat” already calls the FlattenedLoss class.

I’m confused as to what the difference is between the CrossEntropyFlat() loss function modification recommended in the documentation and the U-Net default “FlattenedLoss of CrossEntropyLoss()”.

I think it is this one: https://github.com/fastai/fastai/blob/8013797e05f0ae0d771d60ecf7cf524da591503c/fastai/layers.py#L245

What’s the exact error?

Looks like I got confused between the PyTorch and fastai code. On a closer look, the fastai loss calls the PyTorch loss, and the documentation that gets displayed is from PyTorch (not fastai). I’m trying to figure out why setting loss_func=CrossEntropyFlat() directly causes the error, and to track down where in the U-Net code the default loss_func gets set, because it doesn’t give a mismatch error…

I’m getting this error:

~/miniconda3/envs/in_container/lib/python3.6/site-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   1820     if input.size(0) != target.size(0):
   1821         raise ValueError('Expected input batch_size ({}) to match target batch_size ({}).'
-> 1822                          .format(input.size(0), target.size(0)))
   1823     if dim == 2:
   1824         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

ValueError: Expected input batch_size (5400) to match target batch_size (518400).

The target size is correct, since my output segmentation mask is 480 x 270 pixels with a batch size of 4 (4 x 480 x 270 = 518400). I’m trying to figure out where the input size of 5400 is coming from…

So I can’t figure out why setting loss_func=CrossEntropyFlat() is causing the shape mismatch error above, while creating U-Net with the default of loss_func which is also CrossEntropyFlat() does not cause the error.

But I could do learner.loss_func.reduction = "sum" to switch to sum mode, and now instead of 0.0x losses, I’m getting 10k to 100k losses. Is this supposed to be the case?
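A jump of that size is what switching from mean to sum reduction would be expected to produce, since the summed loss is the mean loss multiplied by the number of pixels in the batch. The sketch below checks this with PyTorch's own cross entropy on random data. The pixel count matches the batch described above, while the class count of 5 and the random logits are assumptions made only for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

n_pixels = 4 * 270 * 480   # batch of 4 masks at 480 x 270, as in the post
n_classes = 5              # hypothetical class count

# Random stand-ins for flattened predictions and labels.
logits = torch.randn(n_pixels, n_classes)
target = torch.randint(0, n_classes, (n_pixels,))

mean_loss = F.cross_entropy(logits, target, reduction="mean")
sum_loss = F.cross_entropy(logits, target, reduction="sum")

# The two differ only by the number of elements being averaged over.
ratio = (sum_loss / mean_loss).item()
print(f"mean={mean_loss.item():.4f} sum={sum_loss.item():.1f} ratio={ratio:.0f}")
```

With 518400 pixels per batch, a mean loss of 0.02 to 0.2 corresponds to a summed loss of roughly 10k to 100k, so the numbers in the post are consistent with each other. The gradients are scaled by the same factor, which is worth keeping in mind when choosing a learning rate.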

@ai_padawan to answer your loss function question, this could be because the U-Net default is CrossEntropyFlat(axis=1), whereas a plain CrossEntropyFlat() uses the default axis of -1 (the last dimension, not the class dimension). See here:
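The 5400 in the error above can be reproduced with some shape arithmetic. The sketch below assumes the flattening in the linked layers.py swaps the chosen axis with the last dimension and then collapses everything else into rows. It also assumes the prediction tensor is laid out as (batch, classes, height, width) = (4, 5, 270, 480), where the class count of 5 is inferred from the error message and is not stated in the thread. The helper flattened_shape is hypothetical and only mirrors that logic in pure Python.

```python
from math import prod

def flattened_shape(shape, axis):
    """Shape after swapping `axis` with the last dim and collapsing the rest into rows."""
    dims = list(shape)
    dims[axis], dims[-1] = dims[-1], dims[axis]
    return prod(dims[:-1]), dims[-1]

# Assumed layout: batch, classes, height, width (5 classes is an inference).
pred_shape = (4, 5, 270, 480)
target_pixels = 4 * 270 * 480  # the flattened target has one label per pixel

# axis=-1: the width is treated as the class dimension.
rows_default, cols_default = flattened_shape(pred_shape, axis=-1)

# axis=1: the real class dimension is moved last before flattening.
rows_axis1, cols_axis1 = flattened_shape(pred_shape, axis=1)

print(rows_default, rows_axis1, target_pixels)
```

Under these assumptions the default axis gives 4 x 5 x 270 = 5400 rows, which matches the input batch_size in the error, while axis=1 gives 518400 rows with 5 columns, which lines up with the flattened target.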


How does this work for other things such as GANs? What part of the code would need to be changed so that images are not resized and training does not throw an error about expecting a smaller image? Thanks!

Did you manage to make it work for large images?
I have the same problem.