How does Keras resize images?

I’m working on State Farm, and vgg16BN has:

def get_batches(self, path, gen=image.ImageDataGenerator(), shuffle=True,
                batch_size=8, class_mode='categorical'):
    return gen.flow_from_directory(path, target_size=(224,224),
                                   class_mode=class_mode, shuffle=shuffle,
                                   batch_size=batch_size)
However, the StateFarm images are 640x480. Does Keras automatically resize or crop the images?

Thanks!

3 Likes

It’s the target_size argument in flow_from_directory:

target_size=(224,224)
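
For example (a minimal sketch; the directory path is hypothetical), every image the iterator yields comes out at the target size, regardless of its size on disk:

from keras.preprocessing import image

gen = image.ImageDataGenerator()
# every image yielded here is resized to 224x224, whatever its original size
batches = gen.flow_from_directory('data/train', target_size=(224, 224),
                                  class_mode='categorical', batch_size=8)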

1 Like

Yes thanks, but how does Keras resize/crop the images? Does it just shrink them, or does it crop out the middle 224x224?

Edit: looks like it’s using PIL under the hood to do the resize.

It ‘squishes’ them down to the appropriate size. BTW, you can use the imshow function from matplotlib’s pyplot to take a look at the images, something like this (pseudocode, so it might not work ;)):

%matplotlib inline
from matplotlib import pyplot as plt

batches = get_batches(...)
imgs, labels = next(batches)  # each batch is an (images, labels) tuple
plt.imshow(imgs[0])  # may need a transpose first -- see below

I think somewhere in there you also need to transpose the array you get out of batches, so that the channels / height / width axes are in the correct order, but I don’t remember off the top of my head. All the code should be in the notebooks and in utils.py, though.

6 Likes

Thanks!

I think this can be a big issue in the fisheries competition, as the images have quite different sizes and aspect ratios. Squishing them probably isn’t good. Perhaps it would be better to rescale them so that their larger dimension is 224 (keeping the aspect ratio constant) and then pad the rest with zeros.
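
Something like this, perhaps (a rough sketch using PIL; the function name and the bilinear filter are my own choices):

from PIL import Image

def rescale_and_pad(img, target=224):
    # scale so the larger dimension becomes `target`, keeping the aspect ratio
    ratio = float(target) / max(img.size)
    new_size = (int(round(img.width * ratio)), int(round(img.height * ratio)))
    img = img.resize(new_size, Image.BILINEAR)
    # paste the result onto a zero (black) canvas, centered
    canvas = Image.new('RGB', (target, target))
    canvas.paste(img, ((target - new_size[0]) // 2,
                       (target - new_size[1]) // 2))
    return canvas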

1 Like

Yes. I’m getting consistently better results by using the 360x640 image size for the fisheries competition.
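
If you go through flow_from_directory, that’s just a different target_size (a sketch; the path is hypothetical, and you may need a smaller batch_size to fit the larger images in GPU RAM):

from keras.preprocessing import image

gen = image.ImageDataGenerator()
batches = gen.flow_from_directory('data/fish/train', target_size=(360, 640),
                                  class_mode='categorical', batch_size=4)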

1 Like

Look at the load_img function in keras/preprocessing/image.py:

Looks like it’s using PIL.Image.resize()

http://pillow.readthedocs.io/en/3.1.x/reference/Image.html
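
So the default behaviour boils down to something like this (a sketch; the file path is hypothetical), i.e. a squish rather than a crop:

from PIL import Image

img = Image.open('driver.jpg')   # e.g. a 640x480 State Farm image
img = img.resize((224, 224))     # squished straight to 224x224, aspect ratio ignored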

I would also be interested in that.
What I’d really be looking into is something like this (a rough sketch follows below):

  1. Find the max height and width across the batch, e.g. with batch size = 3:
     (img1 3 x 1220 x 1200, img2 3 x 1920 x 696, img3 3 x 550 x 550)
     gives us 3 x 1920 x 1200.
  2. Pad all the images with zeros so that they all have dims 3 x 1920 x 1200.
  3. Resize them to the final size (3 x 224 x 224) the way flow_from_directory does.

This would keep the aspect ratio while allowing dynamic sizes. Sadly, I’m not really sure how to integrate that with flow_from_directory.

There might need to be different behaviour if an image is smaller than 224 x 224.
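
A rough sketch of those steps with NumPy and PIL (the function name is mine, and it’s standalone rather than wired into flow_from_directory):

import numpy as np
from PIL import Image

def pad_and_resize_batch(imgs, target=224):
    # imgs: list of channels-first uint8 arrays of varying height/width
    max_h = max(im.shape[1] for im in imgs)
    max_w = max(im.shape[2] for im in imgs)
    out = []
    for im in imgs:
        c, h, w = im.shape
        # step 2: zero-pad every image up to the batch max
        canvas = np.zeros((c, max_h, max_w), dtype=im.dtype)
        canvas[:, :h, :w] = im
        # step 3: squish down to the final size (channels last for PIL, then back)
        pil = Image.fromarray(canvas.transpose(1, 2, 0))
        resized = pil.resize((target, target), Image.BILINEAR)
        out.append(np.asarray(resized).transpose(2, 0, 1))
    return np.stack(out)  # shape: (batch, 3, 224, 224)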

On a bit of a side question: when using VGG we use 224x224 because that’s the image size it was pre-trained on, its “target size”. If I were to build a model from scratch I would have no such constraint.

…So the question is: are bigger images better? If I had 512x512 images, should I keep them at that size or still set the target size to 224x224? Is there a size where images are just too big for CNNs?

@jason Lesson 7 covers how to use different sizes.
Also, as far as I understand, if the image is really big it will just require different arguments (for kernel_size and stride, for example) and maybe more conv layers, since the image will have “too much” data.
This results in a very slow model, which is why attention models can really help.

1 Like

Bigger images are much better. We show an example in the course when we train cats v dogs on larger images, and get much better results. But you’ll also find you need smaller batch sizes (since you run out of GPU RAM) and more time (since there’s more computation). So in the end you have to make your own assessment of time vs accuracy.

It would be great if some people built models using larger images. It would be a great project for a student, in fact!

3 Likes