Here is an example from the fish.ipynb notebook (inside the dl1 directory). In this Kaggle competition the important part of the image is the fish. The fish were often not in the middle of the image, so the cropped image missed the most important information.
@yinterian could you clarify whether, when we set sz, the default resize is a center crop? And to avoid this can we just follow your provided code, passing in crop_type=CropType.NO? Or is that only for augmentations/transforms?
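To make the question concrete, here's a toy, framework-free sketch (my own minimal code, not fastai's actual implementation) of why a default centre crop can drop an off-centre subject, which is what (as I understand it) crop_type=CropType.NO is meant to avoid:

```python
# Minimal reimplementation of a centre crop on a 2-D list "image".
# Not fastai code; just illustrates the failure mode described above.

def center_crop(img, sz):
    """Crop a square of side `sz` from the centre of a 2-D list `img`."""
    h, w = len(img), len(img[0])
    top, left = (h - sz) // 2, (w - sz) // 2
    return [row[left:left + sz] for row in img[top:top + sz]]

# A 4x6 "image" whose interesting content (the 9s, our stand-in fish)
# sits at the right edge, like the off-centre fish in the Kaggle images.
img = [
    [0, 0, 0, 0, 9, 9],
    [0, 0, 0, 0, 9, 9],
    [0, 0, 0, 0, 9, 9],
    [0, 0, 0, 0, 9, 9],
]

cropped = center_crop(img, 4)
# The centre crop keeps columns 1-4 only, so the rightmost fish
# column is thrown away entirely.
print(cropped)
```

A scale-without-crop resize would instead squash the whole 4x6 image into the square, distorting it but keeping every pixel, which is the trade-off the thread is discussing.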
I have a related question on the sz parameter to tfms_from_model. Can we provide a height x width size like (400, 250), or does it need to be square? Digging into the code a little, it looks like it only expects a single value for the sz parameter and resizes to a square image.
Anyway, could we allow the size input to be either an int or a tuple of (h, w)? Or was that found not to be useful?
Based on what @moody mentioned, when cropping is used to convert the images to a square shape and data augmentation is applied, does 1) cropping happen first and then augmentation, or 2) augmentation of the original image and then cropping? It seems like option 2 may retain more data, but option 1 is more computationally efficient.
No new object is created. But on each epoch, the data loader applies transforms with random parameters (zoom, shear, shift, etc.) to each image, creating a slightly modified version of the input so that the network doesn't overfit to the training images.
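That behaviour can be sketched in a few lines (an assumed, framework-free illustration, not fastai's actual loader code): the dataset stores one copy of each image, and a random transform is applied every time an item is fetched, so each epoch sees a slightly different version while the original is never modified.

```python
import random

class AugmentingDataset:
    """Toy dataset: originals are stored once, transforms run per fetch."""

    def __init__(self, images):
        self.images = images  # the stored originals are never modified

    def __getitem__(self, i):
        # A random circular shift stands in for zoom/shear/shift.
        shift = random.randint(-1, 1)
        img = self.images[i]
        return img[shift:] + img[:shift]

ds = AugmentingDataset([[1, 2, 3, 4]])
epoch1 = ds[0]  # what the network sees in epoch 1
epoch2 = ds[0]  # possibly a different view in epoch 2
print(ds.images[0], epoch1, epoch2)
```

The key point: no augmented copies are saved anywhere; the randomness lives in `__getitem__`, so the effective dataset is different on every pass.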
It’s easy enough to do it in the transforms - the issue is that the model itself needs to have a consistent input size. I guess it doesn’t really have to be square - just consistent. But in practice generally we have a mix of landscape and portrait orientation, which means square is the best compromise.
If you have a dataset that’s consistently of a particular orientation, then perhaps it does indeed make sense to use a rectangular input - in which case feel free to provide a PR which allows that (i.e. sz everywhere it’s used would assume square if it’s an int, or rectangle if a tuple).
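For anyone thinking about that PR, the convention Jeremy describes might look something like this (a hypothetical helper, none of these names exist in fastai; it just illustrates "int means square, tuple means rectangle"):

```python
def normalize_sz(sz):
    """Return (height, width): a square for an int, as-given for a tuple."""
    if isinstance(sz, int):
        return (sz, sz)
    h, w = sz  # raises if sz isn't a 2-element sequence
    return (h, w)

print(normalize_sz(224))         # square input, the current behaviour
print(normalize_sz((400, 250)))  # rectangular input, the proposed extension
```

Every place sz is consumed would then work on the normalised (h, w) pair, so existing int-based callers keep working unchanged.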
We specify 3 learning rates for different layer groups, not for single layers. Different layer groups need different amounts of fine-tuning and hence different learning rates. Before unfreezing, we were only training the last layer, so we only needed to supply one learning rate. After unfreezing, if we supply only one learning rate, the fastai library will use the same learning rate for all the layer groups, and this may not be ideal.
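A hedged, framework-free sketch of why this matters (the group names and numbers are illustrative only, not fastai internals; in the library you'd pass something like an array of 3 rates to fit): a single SGD step moves each group by lr * grad, so with the same gradient the pretrained early layers barely move while the new head adapts quickly.

```python
# One learning rate per layer group: smallest for the most general
# pretrained features, largest for the freshly initialised head.
lrs = {"early_convs": 1e-4, "later_convs": 1e-3, "head": 1e-2}

weights = {g: 1.0 for g in lrs}  # pretend each group is one weight
grads = {g: 0.5 for g in lrs}    # identical gradients, for contrast

for g in weights:
    weights[g] -= lrs[g] * grads[g]  # plain SGD update, per-group lr

print(weights)
# the head moves 100x further from 1.0 than the early convs do
```

With one shared learning rate you'd either disturb the pretrained features too much (rate tuned for the head) or train the head too slowly (rate tuned for the early layers).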