Training time with pre-sized images vs. resizing on the fly with transformations

Hi,
I am working on a solution for the UltraMNIST Classification Challenge on Kaggle using the fastai library.
I used a starter notebook whose dataset contains the original 4000x4000 RGB images as well as the same images preprocessed at 500x500 resolution.

When I load the first 1000 images at 500x500 resolution in the dataloaders and train a few epochs, the average training time is about 34 sec/epoch.

I then tried loading the original 4000x4000 images and applying a 500x500 resize transformation, in both item_tfms and batch_tfms. The training time increased dramatically to an average of ~14:30 min/epoch (a ~25-fold increase!).

Is it because the validation set (only 200 images) is at the original resolution?
If that is the bottleneck, could the resize transformation alone be applied to the validation set somehow?
Or is something else happening behind the scenes in these transformations that I am missing?

In both cases I used exactly the same code for data loading and training, shown below:

from fastai.vision.all import *
from PIL import Image, ImageFilter

# pref, suff, df, cb1, cb2 and ApplyPILFilter come from earlier cells of the notebook (not shown)
img_size = 500
bs = 40
splitter = RandomSplitter(0.2)

# Resize every item to 500x500 (LANCZOS for images, nearest for masks)
item_tfms = [Resize(size=img_size, resamples=(Image.Resampling.LANCZOS, 0))]

# PIL filters applied to the input with p=0.5 inside get_x
filters = [
    ImageFilter.EDGE_ENHANCE_MORE,
    ImageFilter.EMBOSS,
    ImageFilter.CONTOUR,
    ImageFilter.FIND_EDGES,
]
xtra_tfms = [Dihedral()]

# All augmentations are effectively disabled except the dihedral flips and the resize
batch_tfms = aug_transforms(do_flip=False, max_rotate=0.0, max_zoom=1.0,
                            p_affine=0.0, p_lighting=0.0, xtra_tfms=xtra_tfms,
                            mode='bilinear', pad_mode='reflection', align_corners=True,
                            mult=1.0, flip_vert=False, min_zoom=1., max_lighting=0.0,
                            max_warp=0.0, size=img_size,
                            batch=False, min_scale=1.)

db = DataBlock(blocks=(ImageBlock, CategoryBlock),
               get_x=Pipeline([ColReader(0, pref=pref, suff=suff),
                               ApplyPILFilter(filters, p=0.5)]),
               get_y=ColReader(1),
               splitter=splitter,
               item_tfms=item_tfms,
               batch_tfms=batch_tfms)
dls = db.dataloaders(df, bs=bs)

learn = vision_learner(dls, densenet121, metrics=accuracy, model_dir="/kaggle/working").to_fp16()

learn.fine_tune(100, 0.002, cbs=[cb1, cb2])

A 4000x4000 image will do that; it takes a large chunk of time just to load it in. Try timing how long it takes to load a single one with img = PILImage.create(fname) vs. your 500x500 version, and then take it further by timing how long it takes to resize it. E.g.:

%%timeit
_ = PILImage.create(fname)

# --- in another cell (create im outside %%timeit so it persists) ---
im = PILImage.create(fname)

%%timeit
_ = Resize(500)(im)
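
You can also time how long a full batch takes to come out of your dataloaders, which captures everything end to end; a rough sketch, assuming the dls you built above:

%%time
# pull one training batch end to end (image creation, item_tfms, collation, batch_tfms)
xb, yb = dls.train.one_batch()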

Oh, I see. The bottleneck is the loading and transformation of the images: creating the image is ~35x slower and the resizing is about 200x (!!) slower.
So the only way forward is to do the resizing I want beforehand and save the results to disk (I want to try different resolutions and measure their impact on accuracy); a rough sketch of what I mean is below.
Thanks a lot for the quick response and the solution to this issue!
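
Something like this is what I have in mind; the paths, sizes and file extension are just placeholders for my setup, and LANCZOS matches the resample I used in the dataloaders:

from pathlib import Path
from PIL import Image

src = Path('train_4000')             # placeholder: folder with the original 4000x4000 images
sizes = [250, 500, 1000]             # resolutions whose impact on accuracy I want to compare

for size in sizes:
    dest = Path(f'train_{size}')
    dest.mkdir(exist_ok=True)
    for fname in src.glob('*.jpeg'): # adjust the extension to the dataset
        img = Image.open(fname)
        # downscale once with LANCZOS and save to the per-resolution folder
        img.resize((size, size), Image.Resampling.LANCZOS).save(dest/fname.name)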


@theros88 there's a new param to a fastai function to help you do just this!

*edited based on Jeremy’s clarification


Not a new function – but the functionality to replicate the source folder structure when recurse=True is new.


That was exactly what I needed. The parallel workers of resize_images really do speed up the process. Thank you both very much!
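
In case it helps anyone else landing here, this is roughly the call I ended up with; just a sketch, the paths are placeholders for my Kaggle setup and the worker count depends on your machine:

from fastai.vision.all import *

# Resize the full-resolution training set once, in parallel, mirroring the
# source folder structure under dest (what recurse=True now gives you).
resize_images('/kaggle/input/ultramnist/train',   # placeholder source path
              dest='/kaggle/working/train_500',   # placeholder destination
              max_size=500,                       # 4000x4000 squares come out as 500x500
              recurse=True,
              max_workers=4)                      # the parallel workers that give the speed-up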