I am trying to build a solution for the UltraMNIST Classification Challenge on Kaggle using the fastai library.
I started from a starter notebook that provides the original RGB images at 4000x4000 resolution as well as the same images preprocessed to 500x500.
When I load the first 1000 images at 500x500 resolution in the dataloaders and train a few epochs, the average training time is about 34 sec/epoch.
I then tried loading the original images (4000x4000) and resizing them to 500x500, both in item_tfms and in batch_tfms. The training time increased dramatically, to an average of about 14:30 min/epoch (roughly a 25-fold increase).
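To put the two timings side by side, here is the arithmetic as a small sketch. It only uses the numbers quoted above; the variable names are mine, and the pixel ratio is included because each original image carries far more data to decode than its 500x500 counterpart.

```python
# Epoch times quoted above, converted to seconds.
fast_epoch_s = 34                # preprocessed 500x500 images
slow_epoch_s = 14 * 60 + 30      # original 4000x4000 images, 14:30 min

# How much slower an epoch became.
slowdown = slow_epoch_s / fast_epoch_s

# How many more pixels each original image holds.
pixel_ratio = (4000 * 4000) / (500 * 500)

print(f"slowdown: {slowdown:.1f}x, pixels per image: {pixel_ratio:.0f}x")
```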
Is it because the validation set (it’s only 200 images) is at the original resolution?
If that is the bottleneck, could we apply only the resize transformation to the validation set somehow?
Or is something else happening behind the scenes in these transformations that I am missing?
For both cases I used exactly the same code for data loading and training, shown below:
from fastai.vision.all import *
from PIL import Image, ImageFilter

# pref, suff, df, ApplyPILFilter, cb1 and cb2 are defined elsewhere in the notebook (not shown here)
img_size = 500
bs = 40
splitter = RandomSplitter(0.2)

# Per-item resize, applied on the CPU before batching
item_tfms = [Resize(size=img_size, resamples=(Image.Resampling.LANCZOS, 0))]

filters = [
    ImageFilter.EDGE_ENHANCE_MORE,
    ImageFilter.EMBOSS,
    ImageFilter.CONTOUR,
    ImageFilter.FIND_EDGES,
]

# Batch-level augmentation: everything disabled except Dihedral
xtra_tfms = [Dihedral()]
batch_tfms = aug_transforms(do_flip=False, max_rotate=0.0, max_zoom=1.0,
                            p_affine=0.0, p_lighting=0.0, xtra_tfms=xtra_tfms,
                            mode='bilinear', pad_mode='reflection',
                            align_corners=True, mult=1.0, flip_vert=False,
                            min_zoom=1., max_lighting=0.0, max_warp=0.0,
                            size=img_size, batch=False, min_scale=1.)

db = DataBlock(blocks=(ImageBlock, CategoryBlock),
               get_x=Pipeline([ColReader(0, pref=pref, suff=suff),
                               ApplyPILFilter(filters, p=0.5)]),
               get_y=ColReader(1),
               splitter=splitter,
               item_tfms=item_tfms,
               batch_tfms=batch_tfms)
dls = db.dataloaders(df, bs=bs)

learn = vision_learner(dls, densenet121, metrics=accuracy,
                       model_dir="/kaggle/working").to_fp16()
learn.fine_tune(100, 0.002, cbs=[cb1, cb2])