Error when resizing predicted mask

Hi all,

I have trained a model for the camvid dataset (as part of lesson 3) which I exported to my Google Drive.
Then I loaded it with learn.load and ran the following:
pred_mask, lbl, probs = learn.predict(img)

The size of pred_mask is (720, 960), but my img has size (720, 1280).

Now when I try to overlay my image with the predicted mask, it does not fit.
I tried to achieve the overlay with the code from the fastai docs:
_,axs = plt.subplots(1,3, figsize=(8,4))
img.show(ax=axs[0], title='no mask')
img.show(ax=axs[1], y=pred_mask, title='masked')
pred_mask.show(ax=axs[2], title='mask only', alpha=1.)

The result I get is this:

But now I want to resize the mask, so I tried to do this:

After doing so, I got this error: RuntimeError: grid_sampler(): expected input and grid to have same dtype, but input has long and grid has float

So I thought maybe I was doing this wrong and should apply a transform instead, so I did:
tfms = get_transforms()
pred_mask.apply_tfms(tfms[0],size=(1, img.size[0], img.size[1]))

This gives the same error…

The only related post I found was this:

So any thoughts on what I am doing wrong, or how to solve this?


For the ones looking for an answer to this after me:
I haven’t found the solution yet to the problem described in my post here.
But someone gave me a tip that I shouldn’t resize the mask itself, but rather the input img.

So I resized the img and then passed the resized image to the predict method, which gave me a good mask that I could overlay as described in the docs.

I am wondering though, are we even able to resize the mask?


I’m experiencing exactly the same issue as you, and haven’t found a way to resize the mask either.

In my opinion, resizing the mask would be the more flexible option, since we want predictions to work across different image sizes.

OK, it seems I found the source of the problem and the solution: cast the data type of the ImageSegment from long to float!

Try adding the following line before calling apply_tfms on pred_mask:

pred_mask = ImageSegment(

And then:
pred_mask.apply_tfms(tfms[0],size=(1, img.size[0], img.size[1]),resize_method=ResizeMethod.SQUISH)

Image segment data should be stored internally as floats, and the resampling methods can be found here

Even though returns a long tensor, it stores a float tensor in px, so I need to call ImageSegment on a float tensor instead of a long one.
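
For anyone who wants to see the idea without fastai in the way, here is a minimal sketch of resizing a class-index mask with nearest-neighbour sampling. It is plain Python on nested lists and is not the fastai code path; resize_mask_nearest is a name I made up for illustration. Because it only copies existing pixels and never interpolates between them, the values stay valid integer class ids, so no float cast is needed in this version.

```python
def resize_mask_nearest(mask, new_h, new_w):
    """Resize a 2-D list of integer class ids with nearest-neighbour sampling.

    Hypothetical helper for illustration only, not a fastai function.
    """
    old_h, old_w = len(mask), len(mask[0])
    resized = []
    for r in range(new_h):
        # Map the centre of each target row back onto the source grid.
        src_r = min(int((r + 0.5) * old_h / new_h), old_h - 1)
        row = []
        for c in range(new_w):
            # Same mapping for columns; clamp to stay inside the source.
            src_c = min(int((c + 0.5) * old_w / new_w), old_w - 1)
            row.append(mask[src_r][src_c])
        resized.append(row)
    return resized


# Toy 2x3 mask with class ids 0, 1 and 2, stretched to 2x4
# (the same kind of widening as going from 720x960 to 720x1280).
small = [[0, 1, 2],
         [0, 1, 2]]
wide = resize_mask_nearest(small, 2, 4)
print(wide)
```

An interpolating resampler would instead blend neighbouring class ids into fractional values, which is one reason that kind of resizing works on float tensors rather than long ones.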
