Rescaling an image when resolution does not meet size criteria (e.g. 224)

Again, I've been following the code to see how a specific problem is solved: i.e., what do you do with images when the input size is not the exact prescribed dimensions (e.g. 224 x 224, 299 x 299, etc.)?

I think this code is at the kernel of the resizing:

def scale_min(im, targ):
    """ Scales the image so that the smallest axis is of size targ.

        im (array): image
        targ (int): target size
    """
    r,c = im.size
    ratio = targ/min(r,c)
    sz = (scale_to(r, ratio, targ), scale_to(c, ratio, targ))
    return im.resize(sz, Image.BILINEAR)
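The `scale_to` helper isn't shown above; as best I recall from the fastai source, it scales a dimension by the ratio but never lets it drop below the target (treat the exact form as an assumption and verify against the repo). Here is a sketch that reproduces just the size arithmetic, without needing PIL:

```python
import math

def scale_to(x, ratio, targ):
    """Scale dimension x by ratio, but never return less than targ (sketch of fastai's helper)."""
    return max(math.floor(x * ratio), targ)

def scale_min_size(r, c, targ):
    """The (width, height) that scale_min would resize to: the smallest axis becomes targ."""
    ratio = targ / min(r, c)
    return (scale_to(r, ratio, targ), scale_to(c, ratio, targ))

# A 300 x 500 image with targ=224: the short side becomes exactly 224,
# the long side is scaled by the same ratio to preserve aspect ratio.
print(scale_min_size(300, 500, 224))  # (224, 373)
```

Note that no padding happens anywhere here: the image is only resized, so the long axis ends up larger than `targ`, not padded to it.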

I could also have sworn that in a separate question, @jeremy mentioned that it is useless to add pixels when the image dimensions are below the prescribed size. I had assumed that at that point the internal code would just pad the original image with zeros. From the above, that is likely not so.

So I just wanted to ask what the best approach would be… I'm not sure padding is ideal, but maybe it is? I haven't ever experimented with it, so I do apologize, but in deep learning there are just so many avenues for experimentation that trying them all is sometimes a little unfeasible.
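Just to make the zero-padding idea concrete, here is a toy sketch on a plain 2D list (purely illustrative; this is not what the fastai code does):

```python
def zero_pad_to(im, targ):
    """Center a 2D list `im` inside a targ x targ grid of zeros (toy illustration)."""
    r, c = len(im), len(im[0])
    top, left = (targ - r) // 2, (targ - c) // 2
    out = [[0] * targ for _ in range(targ)]
    for i in range(r):
        for j in range(c):
            out[top + i][left + j] = im[i][j]
    return out

# A 2x2 "image" padded out to 4x4, centered among zeros:
print(zero_pad_to([[1, 2], [3, 4]], 4))
```

The downside often cited is that the network then spends capacity on constant zero borders, whereas scaling keeps every pixel informative (at the cost of changing the effective resolution).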



There’s no padding here. Try running that code to see what it does - you’ll see it’s making the smallest axis equal to target. Then later on we’ll crop out the center or a random crop.
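To make the scale-then-crop pipeline concrete, here is a toy sketch of the center-crop coordinates (illustrative only, not the library's exact code):

```python
def center_crop_box(w, h, targ):
    """PIL-style (left, top, right, bottom) box for a centered targ x targ crop."""
    left = (w - targ) // 2
    top = (h - targ) // 2
    return (left, top, left + targ, top + targ)

# After scale_min resizes a 300x500 image to 224x373,
# a center crop takes the middle 224x224 region:
print(center_crop_box(224, 373, 224))  # (0, 74, 224, 298)
```

A random crop works the same way, except `left` and `top` are drawn uniformly from `[0, w - targ]` and `[0, h - targ]` instead of being centered.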

Yep… it took me an hour of intense code meditation to convince myself that the center crop happens…

Which brings me to another question: what is the standard image input size expected for ResNet34? I mean, it would either have to be 224 or 299; I suppose it couldn't be both… and since (I believe) there should be one standard dimension, how would we otherwise be able to use the pretrained model for both of these input sizes (224 and 299)? In other words, wouldn't the ResNet34 model have been pretrained on ImageNet ingesting either 224 or 299 image sizes?

Unfortunately, after doing more research on the codebase, I couldn't figure that out. I even checked the ResNet module class; it doesn't seem to specify any expected input size.

P.S. (20 min. later): hmm, just looked at the code some more… still can't figure it out. Sorry if I seem ignorant; it's not for lack of trying (so far 🙂)

I appreciate your close study! There’s no single image size supported by the architectures. They can all handle any size. In lesson 2 we saw how we used different sizes for the same model, in fact, switching in the middle of training!
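One way to see why the architecture is size-agnostic: the conv/pool stack just shrinks whatever spatial size it is given, and a global average pool at the end collapses the remaining grid into a fixed-length vector regardless of its size. Below is a rough walk-through of ResNet34's downsampling schedule; the kernel/stride/padding values are my recollection of the torchvision definition, so treat them as an assumption:

```python
def conv_out(n, k, s, p):
    """Output size of a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

def resnet34_feature_size(n):
    """Spatial size of the final feature map for an n x n input (sketch)."""
    n = conv_out(n, k=7, s=2, p=3)   # stem 7x7 conv, stride 2
    n = conv_out(n, k=3, s=2, p=1)   # 3x3 max pool, stride 2
    for _ in range(3):               # stages 2-4 each downsample once more
        n = conv_out(n, k=3, s=2, p=1)
    return n

print(resnet34_feature_size(224))  # 7:  global avg pool -> one 512-d vector
print(resnet34_feature_size(299))  # 10: global avg pool -> one 512-d vector
```

So a 224 input yields a 7x7 final feature map and a 299 input a 10x10 one, but after global (adaptive) pooling both reduce to the same 512-dimensional vector, which is why the same pretrained weights and classifier head work at either size.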


@jeremy, in your opinion, could I use the "tfms" parameter of tfms_from_model() instead of the "scale_min" code?