The latest fastai blog post is all about training ImageNet in 18 minutes. One of the things that helped them reach the target accuracy faster was, as far as my understanding goes, training with the original aspect ratio of the image instead of a square 224x224 resized image.
Unfortunately I couldn’t figure out from the code alone what one should do to train with the original aspect ratio in fastai. Has someone written one of those incredible Medium articles explaining the idea step by step? Or a video? That would be really helpful.
They only validated with rectangular aspect ratios; they didn’t train that way (that’s planned for fastai_v1).
The idea is that usually, during validation, people take a center crop of the rectangular image to feed into the network. By doing that, you lose crucial information (the crop might cut off a dog’s head, for instance). The network itself doesn’t care whether it gets square or rectangular images, because of the average pooling layer at the end (as Jeremy explained at length during the MOOC). What it does care about is that each batch contains images of the same size, but one batch could be 128 images of 200 by 300 pixels and the next 128 images of 400 by 200.
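To see why the average pooling layer makes the head size-agnostic, here is a minimal pure-Python sketch (illustrative only, not fastai code; in PyTorch this is what `nn.AdaptiveAvgPool2d(1)` does):

```python
# Sketch of global average pooling: each channel's H x W grid is averaged
# down to a single number, so the output length depends only on the number
# of channels, never on the spatial size of the input image.

def global_avg_pool(feature_maps):
    """Average each channel over its spatial dims.

    feature_maps: list of C 2-D lists (one H x W grid per channel).
    Returns a length-C vector, regardless of H and W.
    """
    pooled = []
    for channel in feature_maps:
        total = sum(sum(row) for row in channel)
        count = sum(len(row) for row in channel)
        pooled.append(total / count)
    return pooled

# Two inputs with different spatial sizes but the same channel count:
square_maps = [[[1.0] * 7 for _ in range(7)] for _ in range(3)]   # 3 x 7 x 7
rect_maps = [[[1.0] * 10 for _ in range(5)] for _ in range(3)]    # 3 x 5 x 10

# Both pool to a length-3 vector, so the same linear head works for both.
assert len(global_avg_pool(square_maps)) == 3
assert len(global_avg_pool(rect_maps)) == 3
```

This is why a fully convolutional body plus average pooling can accept rectangular inputs: only the batch itself has to be uniform in size.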
It’s harder to do this for training since we want to shuffle, but for validation they just grouped images with very similar aspect ratios and batched them together. This made validation easier for the model, which translated into needing fewer epochs to train to full accuracy.
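The grouping step can be sketched in a few lines of plain Python (a hypothetical helper, not the actual fastai implementation; the name `batch_by_ratio` is made up here):

```python
# Hypothetical sketch of ratio-based validation batching: sort image
# indices by aspect ratio, then slice consecutive indices into batches
# so each batch holds images of similar shape. Similar-ratio batches can
# then be resized to a shared rectangle with minimal cropping.

def batch_by_ratio(sizes, batch_size):
    """Group image sizes (width, height) into batches of similar aspect ratio.

    sizes: list of (width, height) tuples.
    Returns a list of batches, each a list of indices into `sizes`.
    """
    order = sorted(range(len(sizes)), key=lambda i: sizes[i][0] / sizes[i][1])
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

sizes = [(200, 300), (400, 200), (210, 290), (390, 210)]  # (width, height)
batches = batch_by_ratio(sizes, batch_size=2)
# The portrait-ish images (200x300, 210x290) end up in one batch and the
# landscape ones (400x200, 390x210) in the other: [[0, 2], [3, 1]]
```

Since validation order doesn’t need to be shuffled, sorting like this costs nothing in correctness; for training you’d have to trade off shuffling against shape grouping, which is why it’s harder there.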