I’m working on state-farm, and vgg16BN has:

def get_batches(self, path, gen=image.ImageDataGenerator(), shuffle=True, batch_size=8, class_mode='categorical'):
    return gen.flow_from_directory(path, target_size=(224,224), class_mode=class_mode, shuffle=shuffle, batch_size=batch_size)
However, the StateFarm images are 640x480. Does Keras automatically resize or crop the images?
It ‘squishes’ them down to the appropriate size. BTW, you can use the imshow method (on pyplot in matplotlib, I believe) to take a look at the images, something like this (pseudo code so might not work ;)):
%matplotlib inline
from matplotlib import pyplot as plt
batches = get_batches(...)
imgs, labels = next(batches)  # each batch is a tuple of (images, labels)
plt.imshow(imgs[0])  # if channels come first, transpose to (height, width, channels) first
I think somewhere there you also need to transpose the array you get out of batches so that the channels / height / width axes are in the correct order, but I don’t remember this off the top of my head. All the code should be in the notebooks and in utils.py, though.
I think that this can be a big issue in the Fisheries competition, as the images have quite different sizes and aspect ratios. Squishing them probably isn’t good. Perhaps it would be better to rescale them so that their larger dimension is 224 (keeping the aspect ratio constant) and then pad the rest with zeros.
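A minimal sketch of that rescale-and-pad idea in NumPy (the letterbox name, the centred placement, and the crude nearest-neighbour resize are all my own choices to keep it dependency-free — not anything from the course code):

```python
import numpy as np

def letterbox(img, size=224):
    # img: (height, width, channels) uint8 array
    h, w = img.shape[:2]
    scale = size / max(h, w)  # shrink so the larger dimension becomes `size`
    nh, nw = max(1, round(h * scale)), max(1, round(w * scale))
    # crude nearest-neighbour resize via index sampling
    rows = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[rows][:, cols]
    # centre the resized image on a zero (black) canvas
    out = np.zeros((size, size, img.shape[2]), dtype=img.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    out[top:top + nh, left:left + nw] = resized
    return out
```

On a 480x640 State Farm image this gives a 224x224 array with black bars above and below, and the aspect ratio preserved.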
I would also be interested in that.
I really feel what I’d be looking into is something like this:
find the max height, width of the batch
set it as targeted size and fill it with 0
resize it to final size (224 x 224)
This would keep the ratio while allowing dynamic sizes.
Sadly, I am not really sure how to integrate that with flow_from_directory.
i.e.
batch size = 3
(img1 3 x 1220 x 1200 , img2 3 x 1920 x 696, img3 3 x 550 x 550)
gives us 3 x 1920 x 1200
pad all the images with zeros so that they have dims 3 x 1920 x 1200
resize them the way flow_from_directory does, down to 3 x 224 x 224
Different behavior might be needed if the image is smaller than 224 x 224.
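Putting those steps together, a rough NumPy sketch (the function names are mine, and the nearest-neighbour resize is just a stand-in for whatever interpolation flow_from_directory actually uses):

```python
import numpy as np

def nn_resize(img, out_h, out_w):
    # crude nearest-neighbour resize; img is (channels, height, width)
    c, h, w = img.shape
    rows = (np.arange(out_h) * h // out_h).clip(0, h - 1)
    cols = (np.arange(out_w) * w // out_w).clip(0, w - 1)
    return img[:, rows][:, :, cols]

def pad_batch_and_resize(imgs, size=224):
    # imgs: list of (3, H, W) arrays with differing H, W
    max_h = max(im.shape[1] for im in imgs)
    max_w = max(im.shape[2] for im in imgs)
    out = []
    for im in imgs:
        # step 1-2: pad every image with zeros up to the batch max dims
        padded = np.zeros((3, max_h, max_w), dtype=im.dtype)
        padded[:, :im.shape[1], :im.shape[2]] = im
        # step 3: resize down to the final network input size
        out.append(nn_resize(padded, size, size))
    return np.stack(out)
```

Note that the final step still squishes a little whenever the batch max canvas (e.g. 1920 x 1200) isn’t square; padding every image out to a square of the single largest dimension would keep the ratio exactly.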
On a bit of a side question - when using VGG we use 224x224 because that’s the image size it was pre-trained on and its “target size”. If I were to build a model from scratch I would have no such constraint.
…So question is: Are bigger images better? If I had 512x512 images should I keep them or still set target size to 224x224? Is there a size where the images are just too big for CNNs?
@jason Lesson 7 answers how to use different sizes.
Also, as far as I understand, if the image is really big it will just require different arguments (for kernel_size and stride, for example) and maybe more conv layers, since the image will have “too much” data.
This will result in a very slow model, which is why an attention model can really help.
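To make the “different arguments / more layers” point concrete, here’s the standard conv output-size arithmetic in plain Python (nothing framework-specific): a VGG-style 3x3 pad-1 conv keeps spatial size, and each 2x2 stride-2 pool halves it, so a larger input needs more downsampling stages (and more computation) to reach the same small feature map.

```python
def conv_out(size, kernel, stride, pad=0):
    # standard formula: floor((size + 2*pad - kernel) / stride) + 1
    return (size + 2 * pad - kernel) // stride + 1

def pools_needed(size, target=7):
    # how many 2x2 stride-2 pools until the feature map is <= target
    n = 0
    while size > target:
        size //= 2
        n += 1
    return n

print(conv_out(224, kernel=3, stride=1, pad=1))  # 3x3 pad-1 conv keeps size: 224
print(pools_needed(224))  # 5 pools: 224 -> 7 (the VGG feature-map size)
print(pools_needed(512))  # 7 pools: a 512 input needs extra downsampling stages
```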
Bigger images are much better. We show an example in the course when we train cats v dogs on larger images, and get much better results. But you’ll also find you need smaller batch sizes (since you run out of GPU RAM) and more time (since there’s more computation). So in the end you have to make your own assessment of time vs accuracy.
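A back-of-the-envelope for the GPU RAM point: activation memory grows roughly with image area, so (as a crude rule of thumb, not an exact law) the batch size has to shrink by about the area ratio:

```python
def scaled_batch_size(base_bs, base_side, new_side):
    # activations scale roughly with H*W, so divide the batch size
    # by the area ratio to fit in the same GPU RAM (rough heuristic)
    ratio = (new_side / base_side) ** 2
    return max(1, int(base_bs / ratio))

print(scaled_batch_size(64, 224, 512))  # ~12: 512x512 images cost ~5x the memory
```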
It would be great if some people built models using larger images. Would be a great project for a student, in fact!..