Order of transformations

I’m having trouble cutting holes in my images along a grid.

I have imported the images with size 224 as follows:

data = ImageDataBunch.from_folder(path=path, train="train", valid="validation", ds_tfms=tfms, size=224, num_workers=16, bs=64).normalize(imagenet_stats)

I want to cut square holes of size 56x56 (basically divide my 224x224 image into 16 squares and randomly cut some of them).
image
But as we can see in the red highlighted parts, some cuts don’t fit the 56x56 square size.
Also, since I cut each hole with 50% probability, about half of the image should be cut, but it appears that more than half is always being cut.

I suppose this is happening because the cuts are being made before the image is resized to 224 (the original images come in varying sizes).
If that’s the case, I would probably want to change the order of the transformation, right?

 cutoutgrids = TfmPixel(_cutoutgrids, order=20) 

I tried changing it (order = 1, 20, 30, 50, …) but none of these solved the problem. I thought all transformations were applied after the resizing.

Here’s the transformation I defined in case you want to check:

def _cutoutgrids(x, patch_size:uniform_int=32, p_cut = 0.5):
    "Divides the image in a grid and cut some parts"
    h,w = x.shape[1:]
    for i in range(int(w/patch_size)):
        for j in range(int(h/patch_size)):
            if np.random.uniform() < p_cut:
                y1 = j*patch_size 
                y2 = y1 + patch_size 
                x1 = i*patch_size
                x2 = x1 + patch_size
                x[0, y1:y2, x1:x2] = 0.540202
                x[1, y1:y2, x1:x2] = 0.544406
                x[2, y1:y2, x1:x2] = 0.514351
    return x

cutoutgrids = TfmPixel(_cutoutgrids, order=20)

tfms = [cutoutgrids(patch_size=(56,56))]
tfms = [tfms, []]
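
One way to check the grid logic on its own, outside fastai, is to run it on a plain array that is already 224x224. The sketch below is a NumPy stand-in for the transform above, not the fastai version: the name cutout_grid, the rng parameter and the returned mask are assumptions made for illustration.

```python
import numpy as np

def cutout_grid(x, patch_size=56, p_cut=0.5,
                fill=(0.540202, 0.544406, 0.514351), rng=None):
    """Fill randomly chosen grid cells of a (channels, height, width) array.

    Hypothetical stand-in for _cutoutgrids; returns the new array and the
    boolean mask of cells that were cut.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = x.copy()
    _, h, w = out.shape
    n_rows, n_cols = h // patch_size, w // patch_size
    # One independent draw per grid cell.
    cut = rng.random((n_rows, n_cols)) < p_cut
    for j, i in zip(*np.nonzero(cut)):
        y1, x1 = j * patch_size, i * patch_size
        for c, value in enumerate(fill):
            out[c, y1:y1 + patch_size, x1:x1 + patch_size] = value
    return out, cut

img = np.ones((3, 224, 224), dtype=np.float32)
out, cut = cutout_grid(img, rng=np.random.default_rng(0))
holes = out[0] != 1.0
# On a 224x224 input, the share of cut pixels matches the share of cut cells.
print(int(cut.sum()), "of", cut.size, "cells cut;", float(holes.mean()), "of pixels cut")
```

If the holes are aligned here but not in the displayed batch, the misalignment is more likely to come from a resize or crop applied around the transform than from the grid loop itself.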

Well, I guess you can try order=9999, but if that still doesn’t work, it probably means the problem comes from somewhere else. Your function looks fine to me, so you’re probably right: your image gets cropped after your transform has been applied. Does data.tfms contain anything other than your transform? If it doesn’t, it actually doesn’t make sense that this problem is occurring.

I’ve solved the problem by changing size from 224 to (224,224).
I remember from one of Jeremy’s classes that when we give an integer to size, it resizes the images to square images of that size, but I must have got it wrong…

The difference lies here:

default_rsz = ResizeMethod.SQUISH if (size is not None and is_listy(size)) else ResizeMethod.CROP

If you pass a tuple it will squish the image, while if you pass a single integer, it will crop it. However, the resize is still applied first in theory, so that doesn’t explain why it wouldn’t work. The secret of your problem probably lies in this part of the code (from fastai.vision.image.Image.apply_tfms):

        size_tfms = [o for o in tfms if isinstance(o.tfm,TfmCrop)]
        for tfm in tfms:
            if tfm.tfm in xtra: x = tfm(x, **xtra[tfm.tfm])
            elif tfm in size_tfms:
                if resize_method in (ResizeMethod.CROP,ResizeMethod.PAD):
                    x = tfm(x, size=_get_crop_target(size,mult=mult), padding_mode=padding_mode)
            else: x = tfm(x)

I don’t exactly know what x = tfm(x, size=_get_crop_target(size,mult=mult), padding_mode=padding_mode) does but it may be problematic.

It’s all a result of the ResizeMethod used:

  • passing size=224 makes the resize method default to crop, as you pointed out. In that case we resize the picture, keeping its aspect ratio, so that the smallest dimension is 224, then crop randomly along the other dimension (which is why we lose parts of those squares).
  • passing size=(224,224) makes the resize method default to squish. In that case we resize the picture to (224,224), which squishes it a bit if it was rectangular, and we don’t lose any part of the image.
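
The geometry of the two cases can be sketched in plain Python. This is a simplified model rather than fastai code: the function names are made up for illustration, the 448x600 input is an arbitrary example, and the crop offset of 38 pixels is just one possible position (a centred one), since the actual crop position may vary.

```python
def crop_resize_shape(h, w, size):
    """Shape after scaling so the smaller side equals `size` (aspect ratio kept)."""
    ratio = size / min(h, w)
    return round(h * ratio), round(w * ratio)

def squish_resize_shape(h, w, size):
    """Shape after forcing both sides to `size` (aspect ratio not kept)."""
    return size, size

def visible_widths(full_width, crop_width, left, patch_size):
    """Width of each grid column that survives a crop starting at `left`."""
    widths = []
    for x1 in range(0, full_width - patch_size + 1, patch_size):
        lo = max(x1, left)
        hi = min(x1 + patch_size, left + crop_width)
        if hi > lo:
            widths.append(hi - lo)
    return widths

# Crop case: a 448x600 picture is first resized to 224x300, the 56-pixel grid
# is laid out on that, and a 224-wide window is then cropped out of it.
h, w = crop_resize_shape(448, 600, 224)
print((h, w), visible_widths(w, 224, left=38, patch_size=56))

# Squish case: the picture is already 224x224 when the grid is laid out.
h, w = squish_resize_shape(448, 600, 224)
print((h, w), visible_widths(w, 224, left=0, patch_size=56))
```

In the crop case the outermost grid columns are only partly inside the final window, which is consistent with the truncated squares in the screenshot; in the squish case every column keeps its full 56 pixels.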

Note that the right way to do it is not to pass size=(224,224), but to pass size=224 along with resize_method=ResizeMethod.SQUISH.

I’m not sure I understand the difference between passing size=(224,224) and passing size=224 along with resize_method=ResizeMethod.SQUISH.

Won’t the final result be the same?

Thanks @sgugger and @florobax

Yes, the result will be the same. I’m just saying it’s the way it should be done, since it makes it clearer why you get this result.

Am I missing something here… I thought I should see the whole list of tfms I’m passing…