Segmentation mask prediction on different input image sizes

Hello, I successfully trained a segmentation UNet model (based on unet_learner with a resnet34 architecture) and exported it with the learn.export method.

When I load the saved model and run predictions, it works as expected for images that match the training image dimensions. But if I run predictions on any other image size, the predicted segmentation mask does not match the input image size. I receive a predicted mask resized to the original training dimensions.

I guess this is linked to the transforms I’m using during training.
I tried removing the default crop_pad transform from valid_ds, but the predicted mask is always resized.

Could you help me with the settings for my learner before export, so that predictions have the same size as the input image?

Input image size:
torch.Size([1184, 1184])

Predicted mask size:
torch.Size([128, 128])

Desired mask size:
torch.Size([1184, 1184])

You either need to create a new DataBunch with the desired size and then load your model (with learn.load), or hack into where the size is stored if you export your Learner, which is in learn.data.train_ds.tfmargs (and the same for valid_ds and test_ds if applicable). At inference, I think it uses the transforms of single_dl, so you might have to do

learn.data.single_dl.dataset.tfmargs['size'] = bla

but not entirely sure.

Thanks a lot @sgugger, changing learn.data.single_ds.tfmargs['size'] = None to predict one image at a time worked like a charm!
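
For anyone else landing here, a minimal sketch of that workflow, assuming a fastai v1 Learner exported with learn.export (the paths and file names below are placeholders):

from fastai.vision import *   # fastai v1; provides load_learner and open_image

learn = load_learner('path/to/export_dir')    # folder containing export.pkl

# Drop the size baked in at export time so the single-item pipeline
# stops resizing inputs (and therefore masks) to the training dimensions.
learn.data.single_ds.tfmargs['size'] = None

img = open_image('path/to/image.png')         # e.g. 1184 x 1184
pred_mask, _, _ = learn.predict(img)          # mask now matches the input size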

Let me know if I should make a new thread (or open a GitHub issue for this) – I was using this technique successfully until I stumbled across an image with one dimension that was odd (e.g., 341 x 512). I then got an error (traceback is below) noting that Sizes of tensors must match except in dimension 1. Got 341 and 342 in dimension 2. I thought that there was some intrinsic problem, but in this case it seems that the problem stems from the dimension not being even. When I resized the image to be of dimension (342 x 512) instead of (341 x 512), the problem was resolved.

This is something I can guard against (currently writing up boundary checks), but I wonder if the solution best belongs in fastai (which is presumably the source of the 342 dimension in the first place)?
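
In case it is useful to others, a minimal sketch of the kind of boundary check I mean (it assumes example_image is a fastai v1 Image, whose .size is (height, width)):

# Refuse odd dimensions up front: the U-Net's skip connections concatenate
# feature maps whose spatial sizes must match after down/up-sampling.
h, w = example_image.size
if h % 2 != 0 or w % 2 != 0:
    raise ValueError(f"Odd image size {h} x {w}; pad to even dimensions before predict.")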

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-56-537da95efe79> in <module>
----> 1 ll.predict(example_image)

/opt/anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in predict(self, item, return_x, batch_first, with_dropout, **kwargs)
    369         "Return predicted class, label and probabilities for `item`."
    370         batch = self.data.one_item(item)
--> 371         res = self.pred_batch(batch=batch, with_dropout=with_dropout)
    372         raw_pred,x = grab_idx(res,0,batch_first=batch_first),batch[0]
    373         norm = getattr(self.data,'norm',False)

/opt/anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in pred_batch(self, ds_type, batch, reconstruct, with_dropout, activ)
    348         activ = ifnone(activ, _loss_func2activ(self.loss_func))
    349         with torch.no_grad():
--> 350             if not with_dropout: preds = loss_batch(self.model.eval(), xb, yb, cb_handler=cb_handler)
    351             else: preds = loss_batch(self.model.eval().apply(self.apply_dropout), xb, yb, cb_handler=cb_handler)
    352             res = activ(preds[0])

/opt/anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
     24     if not is_listy(xb): xb = [xb]
     25     if not is_listy(yb): yb = [yb]
---> 26     out = model(*xb)
     27     out = cb_handler.on_loss_begin(out)
     28 

/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

/opt/anaconda3/lib/python3.7/site-packages/fastai/layers.py in forward(self, x)
    134         for l in self.layers:
    135             res.orig = x
--> 136             nres = l(res)
    137             # We have to remove res.orig to avoid hanging refs and therefore memory leaks
    138             res.orig = None

/opt/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

/opt/anaconda3/lib/python3.7/site-packages/fastai/layers.py in forward(self, x)
    148     "Merge a shortcut with the result of the module by adding them or concatenating thme if `dense=True`."
    149     def __init__(self, dense:bool=False): self.dense=dense
--> 150     def forward(self, x): return torch.cat([x,x.orig], dim=1) if self.dense else (x+x.orig)
    151 
    152 def res_block(nf, dense:bool=False, norm_type:Optional[NormType]=NormType.Batch, bottle:bool=False, **conv_kwargs):

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 341 and 342 in dimension 2 at /opt/conda/conda-bld/pytorch_1573049306803/work/aten/src/THC/generic/THCTensorMath.cu:71

Unet in v1 doesn’t work with odd sizes. This is known and will be fixed in v2.

For what it’s worth, I wrote a function for fastai Images that pads them by 1 px if one of their dimensions is odd. In my testing, this seems to be an OK workaround for now that lets me use the U-Net in v1 with odd sizes:

from fastai.vision import *   # fastai v1; provides crop_pad and the Image type

def fix_odd_sides(example_image):
    # Image.size is (height, width); pad the height by 1 px if it is odd.
    if (list(example_image.size)[0] % 2) != 0:
        example_image = crop_pad(example_image,
                                 size=(list(example_image.size)[0] + 1, list(example_image.size)[1]),
                                 padding_mode='reflection')

    # Pad the width by 1 px if it is odd.
    if (list(example_image.size)[1] % 2) != 0:
        example_image = crop_pad(example_image,
                                 size=(list(example_image.size)[0], list(example_image.size)[1] + 1),
                                 padding_mode='reflection')

    return example_image
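
A hypothetical usage, reusing the exported learner from earlier in the thread (file names are placeholders):

img = open_image('odd_sized_image.png')   # e.g. 341 x 512
img = fix_odd_sides(img)                  # reflection-padded to 342 x 512
pred_mask, _, _ = learn.predict(img)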

As of today, this is no longer working because the attribute learn.data does not exist in the latest fastai. I have looked over the forums and can’t find any reference to this.

@sgugger can you advise on how to handle this in the new fastai?

@asoellinger Did you find a solution?

Short answer, I can’t remember.
Here’s what I did

Hi,
I have the same question for models trained with unet_learner on resnet34.
Can anyone point me in the right direction?

Thanks!