How do sz and max_zoom in tfms_from_model work?

When we set up the data for training, we do:

tfms_from_model(model, sz, aug_tfms=transforms_top_down, max_zoom=1.05)

I am confused about the parameters sz and max_zoom. I think sz is the input size of the model, so an image of any size will be resized to sz (and cropped to a square).

For max_zoom, I suppose the image is zoomed by this factor and then resized to sz afterwards? Is my thinking right?

Does the resize process have high computational complexity? I suppose it needs a lot of interpolation, right?

Finally, I am trying to read the source code, but my programming skills are quite limited (I mostly write small scripts and have never dealt with a large piece of software). Can anyone suggest a technique for reading the source code efficiently and also improving my programming skills?

Thank you so much

So, from what I see in the code, the max_zoom parameter controls the scaling size.
If it is set, the image is resized to a random size between sz and sz*max_zoom with 75% probability (by default). After that it is cropped to a rectangle (of size sz). Take a look at the RandomScale class in transforms.py.
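
To make that concrete, here is a minimal pure-Python sketch of the size selection only. It is not the fastai code itself: the function name pick_scaled_size and its parameters are made up for illustration, and the truncation to an integer is my assumption. Only the 75% probability and the sz to sz*max_zoom range come from the description above.

```python
import random

def pick_scaled_size(sz, max_zoom, p=0.75, rng=random):
    """Return the side length an image would be scaled to before cropping.

    Hypothetical helper: with probability p the size is drawn uniformly
    from [sz, sz * max_zoom]; otherwise it stays at sz.
    """
    if max_zoom is None or rng.random() >= p:
        return sz
    # Truncating to an integer is an assumption made for this sketch.
    return int(sz * rng.uniform(1.0, max_zoom))

rng = random.Random(0)
sizes = [pick_scaled_size(224, 1.05, rng=rng) for _ in range(1000)]
print(min(sizes), max(sizes))
```

With sz=224 and max_zoom=1.05 the scaled size never exceeds 235, so the subsequent crop back to sz removes at most a thin border.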

I'm not sure about the computational complexity. The resizing of the images is done by the OpenCV library, and I guess it has some optimizations.
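
As a rough way to think about the cost: with bilinear interpolation, each output pixel is blended from 4 neighbouring input pixels, so the work grows linearly with the number of output pixels. The pure-Python sketch below only illustrates that idea. The function resize_bilinear is hypothetical, uses a simplified coordinate mapping, and is far slower than OpenCV's optimized routines; it counts how many input pixels are read.

```python
def resize_bilinear(img, out_h, out_w):
    """Resize a grayscale image (list of rows) with bilinear interpolation.

    Returns the resized image and the number of input pixels read.
    """
    in_h, in_w = len(img), len(img[0])
    reads = 0
    out = []
    for i in range(out_h):
        # Map the output row back to a (fractional) input row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Blend the 4 surrounding input pixels.
            top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
            bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
            reads += 4
        out.append(row)
    return out, reads

image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
small, reads = resize_bilinear(image, 4, 4)
print(reads)  # 4 reads per output pixel: 4 * 4 * 4 = 64
```

So the cost is a small constant amount of arithmetic per output pixel, which is cheap compared with a forward pass through the network.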

With respect to reading the source code, I don’t think I can suggest anything other than practice and a text editor that lets you quickly jump to a function's declaration. Also, to get familiar with the style that Jeremy uses, you should read this: https://github.com/fastai/fastai/blob/master/docs/abbr.md

Hope that helps.


Thank you @bny6613, that really helps.

I had thought that images are resized to sz before being fed to the model, but from your explanation that is wrong: the images are instead cropped to sz.

About reading the source code, I will read the style guide you mentioned. I am using Sourcegraph, which many people in this forum have mentioned, to track related classes, and it helps a lot. I am also testing small pieces of code in a Jupyter notebook to understand them better. Do you think using another IDE, for example PyCharm, would make it easier to understand the big picture of the software, or is Jupyter alone OK?

Reading big professional source code always makes me feel overwhelmed, then exhausted, and then I quit. This time I will be patient.

I would suggest vscode for code analysis.

I use Jupyter for experiments and running the code. But at the same time I have vscode open with the source, so I can dig in and/or modify it if needed.


I use exactly the same approach :)

I want to clarify that the image will be resized AND cropped. It is first resized so that the smallest side = sz, and then cropped to a rectangle.
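
In terms of sizes only, a minimal sketch of that resize-then-crop step could look like this. The helper name and the rounding are my assumptions, not taken from fastai's transforms.py; it just shows the arithmetic for a centred crop.

```python
def scale_then_center_crop(h, w, sz):
    """Return the resized (height, width) and the centre-crop box.

    Hypothetical helper: scales so the smaller side equals sz while
    keeping the aspect ratio, then takes a centred sz x sz crop.
    """
    ratio = sz / min(h, w)
    new_h, new_w = max(sz, round(h * ratio)), max(sz, round(w * ratio))
    top, left = (new_h - sz) // 2, (new_w - sz) // 2
    # Box is (top, left, bottom, right) in pixels of the resized image.
    return (new_h, new_w), (top, left, top + sz, left + sz)

resized, box = scale_then_center_crop(300, 400, 224)
print(resized, box)  # (224, 299) (0, 37, 224, 261)
```

For a 300x400 image and sz=224, the whole height is kept and only the left and right edges are cut off by the crop.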

If you run this:

trn_tfms, val_tfms = tfms_from_model(arch, sz)
val_tfms.tfms

you will see all the transforms, in the order in which they are applied:

[<fastai.transforms.Scale at 0x11fc8a1feb8>,
 <fastai.transforms.CenterCrop at 0x11fc8a1ff60>,
 <fastai.transforms.Normalize at 0x11fc89892b0>,
 <fastai.transforms.ChannelOrder at 0x11fc8a52ac8>]

You can also apply them yourself

import matplotlib.pyplot as plt

img = open_image('path_to_your_image.jpg')

# Transform 0: Scale only
transform_index = 0
transformed_image = val_tfms.tfms[transform_index](img, None)[0]
plt.imshow(transformed_image);

# Transform 1: CenterCrop, applied to the already scaled image
transform_index = 1
transformed_image = val_tfms.tfms[transform_index](transformed_image, None)[0]
plt.imshow(transformed_image);

Thank you so much @urmas.pitsi, @bny6613 for your help. I will try to use vscode.

About the sz parameter: thanks for the very clear demonstration, @bny6613. You really helped me see how to test that code. I tried it quickly but did not succeed.