In transforms.py, crop is applied after rescale

gdc · January 30, 2018, 12:04am

Hi @jeremy
First, since this is the first time I've pinged you, thanks for the great course. The generated abstracts in lesson 4v2 are absolutely mind-blowing! I can't wait to try it myself.

For now I'm still working on image classification, on Kaggle's IEEE camera recognition competition, and I think I've hit a bug in fastai.
I want to apply a random crop to the images without downscaling, to keep the native resolution (take 299*299 pixels out of the picture, even if that's only a small part of it). I found that the crop is applied after the scaling because of this code in transforms.py:

self.tfms = tfms + [crop_tfm, normalizer, channel_dim]

github.com

fastai/fastai/blob/fd4f5cf803d12ac56e17a2ee9e777a76efc2dbc1/fastai/transforms.py#L441


CENTER = 2
NO = 3

class Transforms():
    def __init__(self, sz, tfms, normalizer, denorm, crop_type=CropType.CENTER, tfm_y=TfmType.NO):
        self.sz,self.denorm = sz,denorm
        crop_tfm = CenterCrop(sz, tfm_y)
        if crop_type == CropType.RANDOM: crop_tfm = RandomCrop(sz, tfm_y)
        if crop_type == CropType.NO: crop_tfm = NoCrop(sz, tfm_y)
        self.tfms = tfms + [crop_tfm, normalizer, channel_dim]
    def __call__(self, im, y=None): return compose(im, y, self.tfms)

def image_gen(normalizer, denorm, sz, tfms=None, max_zoom=None, pad=0, crop_type=None, tfm_y=None):
    if tfm_y is None: tfm_y=TfmType.NO
    if tfms is None: tfms=[]
    elif not isinstance(tfms, collections.Iterable): tfms=[tfms]
    scale = [RandomScale(sz, max_zoom, tfm_y=tfm_y) if max_zoom is not None else Scale(sz, tfm_y)]
    if pad: scale.append(AddPadding(pad))
    if (max_zoom is not None or pad!=0) and crop_type is None: crop_type = CropType.RANDOM

So I end up with rescaled images, and the augmented images are quite close to one another.
In my case I patched this to self.tfms = [crop_tfm] + tfms + [normalizer, channel_dim]
but I guess that would break the intuitive default behavior of crop_type=CENTER for most image recognition problems.
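
To make the effect of the ordering concrete, here is a rough sketch that only tracks shapes, not real pixels. The `View`, `scale`, `crop`, `compose` and `source_fraction` names are stand-ins I made up for illustration (they are not the fastai classes), the 3000x4000 photo size is an assumption, and the normalizer and channel_dim steps are left out because they don't change the geometry.

```python
from dataclasses import dataclass

@dataclass
class View:
    h: int        # current height in pixels
    w: int        # current width in pixels
    src_h: float  # height of the original-image region this view covers
    src_w: float  # width of that region

def scale(sz):
    """Stand-in for a rescale: make the smaller side equal to sz."""
    def _f(v):
        r = sz / min(v.h, v.w)
        return View(round(v.h * r), round(v.w * r), v.src_h, v.src_w)
    return _f

def crop(sz):
    """Stand-in for a square crop of side sz (the position does not matter here)."""
    def _f(v):
        new_h, new_w = min(sz, v.h), min(sz, v.w)
        return View(new_h, new_w, v.src_h * new_h / v.h, v.src_w * new_w / v.w)
    return _f

def compose(v, fns):
    """Apply the transforms in list order."""
    for fn in fns:
        v = fn(v)
    return v

def source_fraction(v, orig_h, orig_w):
    """Fraction of the original image area that ends up in the output."""
    return (v.src_h * v.src_w) / (orig_h * orig_w)

orig = View(3000, 4000, 3000, 4000)  # assumed 12-megapixel photo
tfms = [scale(299)]
crop_tfm = crop(299)

stock = compose(orig, tfms + [crop_tfm])    # scale first, then crop
patched = compose(orig, [crop_tfm] + tfms)  # crop first, then scale (a no-op here)

print(stock, source_fraction(stock, 3000, 4000))
print(patched, source_fraction(patched, 3000, 4000))
```

Both orderings give a 299x299 output, but under these assumptions the stock order keeps roughly 75% of the original field of view at reduced resolution, while the patched order keeps under 1% of it at native resolution.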

I think the clean way to fix this is to add an optional crop_size parameter, on top of the current crop_type, that would be either an int (number of pixels to keep) or a float (percentage of the image to keep), and to apply the crop before tfms.
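
As a rough sketch of what I mean by crop_size, something like the helper below could turn the parameter into a pixel size. `resolve_crop_size` is a hypothetical name, nothing like it exists in fastai, and I'm assuming here that a float means a fraction of each side rather than of the area.

```python
def resolve_crop_size(crop_size, h, w):
    """Return (crop_h, crop_w) in pixels for an image of height h and width w.

    Hypothetical helper, not part of fastai.
    int   -> side of a square crop in pixels, clamped to the image size
    float -> fraction of each side to keep, in (0, 1] (an assumed convention)
    """
    if isinstance(crop_size, bool):
        raise TypeError("crop_size must be an int or a float, not a bool")
    if isinstance(crop_size, int):
        if crop_size <= 0:
            raise ValueError("an int crop_size must be positive")
        return min(crop_size, h), min(crop_size, w)
    if isinstance(crop_size, float):
        if not 0.0 < crop_size <= 1.0:
            raise ValueError("a float crop_size must be in (0, 1]")
        return max(1, round(h * crop_size)), max(1, round(w * crop_size))
    raise TypeError("crop_size must be an int or a float")

print(resolve_crop_size(299, 3000, 4000))  # pixel count
print(resolve_crop_size(0.5, 3000, 4000))  # fraction of each side
```

The resulting size would then be handed to whichever crop runs before tfms.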

What do you think?
If you agree with the approach, I can submit a PR.

gdc · January 31, 2018, 7:41am

Looking at this again, it seems the crop feature is intended more to fit the model input shape than for data augmentation, am I right? So instead of modifying it, I should have added another random crop transformation, placed first in tfms_aug?

Or maybe I should even have pre-cropped the JPEG files offline (though I'd lose the randomness), since right now I have a CPU bottleneck from opening these big JPEG files just to discard 90%+ of the pixels.
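
For the separate random crop option, the core of such a transformation could be as small as the sketch below. `random_crop_box` is a hypothetical helper, not a fastai transform; it only picks the window, and wiring it into the augmentation list is left out because that depends on the library's transform interface.

```python
import random

def random_crop_box(h, w, sz, rng=random):
    """Pick a random sz x sz window inside an h x w image.

    Returns (top, left, bottom, right) in pixels, so no rescaling is involved.
    Hypothetical helper for illustration only.
    """
    if h < sz or w < sz:
        raise ValueError("image is smaller than the requested crop")
    top = rng.randint(0, h - sz)    # randint includes both endpoints
    left = rng.randint(0, w - sz)
    return top, left, top + sz, left + sz

top, left, bottom, right = random_crop_box(3000, 4000, 299)
print(top, left, bottom, right)
# With a NumPy H x W x C array `im`, the patch would be im[top:bottom, left:right].
```

The same box could also drive an offline pre-crop, at the cost of fixing the random choice once per saved file.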

Ashka · May 11, 2018, 8:51pm

Wait, does that mean the images are cropped instead of being resized (i.e. downsized)?

gdc · May 11, 2018, 10:35pm

Yes, the images are resized, which I guess is the right behaviour for the vast majority of image recognition tasks.

My problem was that in this particular case I didn't want the rescaling, I just wanted a crop (even if it meant discarding 90% of the image). That was not possible, since the images were first resized so that the smallest dimension was 299px and then randomly cropped.

uulwake · November 6, 2019, 9:22am

Hi @gdc, I have exactly the same problem as you. Did you find a solution?

gdc · November 6, 2019, 10:52am

Hi!
At the time, I just kept a separate clone of the fastai lib for the Kaggle comp, with the patch mentioned above (forcing crop_tfm to be applied before tfms). I guess this is unlikely to still be relevant today, though: I think the fastai lib was rewritten almost entirely with v1?