What to do if raw images are very large (CPU bottleneck)

My raw datasets are images of around 4000 x 3000 resolution. This causes a DataLoader bottleneck: CPU usage sits at 100% resizing images (my item resize setting is 512 x 512). My current approach is to pre-resize the images to 512 x n or n x 512, depending on the height:width ratio.

Is this approach appropriate? I am still running trials to see how the loss of information affects the metrics.
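For reference, the pre-resize step is roughly the following (a minimal sketch using plain PIL; the folder names are placeholders):

from pathlib import Path
from PIL import Image

SRC = Path('raw_images')    # placeholder source folder
DST = Path('resized_512')   # placeholder destination folder
DST.mkdir(exist_ok=True)

for fn in SRC.glob('*.jpg'):
    img = Image.open(fn)
    w, h = img.size
    scale = 512 / min(w, h)                          # shorter side becomes 512
    new_size = (round(w * scale), round(h * scale))  # keep the aspect ratio
    img.resize(new_size).save(DST / fn.name)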

Resizing the images and saving the resized versions to your hard drive is the best way to speed up your training.
If you feel like you are losing performance at a small image size, try progressive resizing. For example, train at 224 x 224 px first, then 448 x 448, then 672 x 672, and so on. This tends to give better results than starting at a high resolution.


Thanks for the suggestion, but I am not so sure how to implement progressive resizing. Do I need to change the size in item_tfms or in batch_tfms? Does the following code do the trick?

def get_dls(size, bs):
    # Squish each item to a fixed 512 x 512 on the CPU, then resize/augment the batch to `size` on the GPU
    dls = ImageDataLoaders.from_df(df,
                                   path=path,
                                   fn_col='path',
                                   valid_col='is_val',
                                   label_col='target',
                                   y_block=CategoryBlock,
                                   item_tfms=Resize(512, method=ResizeMethod.Squish),
                                   batch_tfms=aug_transforms(size=size),
                                   bs=bs)
    return dls


# larger image size, smaller batch size as training progresses
dls = get_dls(256, 128)
dls = get_dls(384, 64)
dls = get_dls(512, 32)
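And then I assume the idea is to swap the dataloaders into the learner between training phases, something like the sketch below (the vision_learner setup and epoch counts are just placeholders)?

learn = vision_learner(get_dls(256, 128), resnet34, metrics=accuracy)
learn.fine_tune(5)

# keep training the same model at progressively larger sizes
learn.dls = get_dls(384, 64)
learn.fine_tune(5)

learn.dls = get_dls(512, 32)
learn.fine_tune(5)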

IIRC, most people dealing with this tend to split the images up into tiles, so nothing is lost to downscaling or cropping. Usually this is done for segmentation rather than classification, but I imagine you could take a similar approach.
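A rough sketch of that kind of tiling (the tile size and output path are arbitrary):

from pathlib import Path
from PIL import Image

def tile_image(fn, dest, tile=512):
    "Cut one large image into non-overlapping tile x tile crops (edge remainders are dropped)."
    fn, dest = Path(fn), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    img = Image.open(fn)
    w, h = img.size
    for top in range(0, h - tile + 1, tile):
        for left in range(0, w - tile + 1, tile):
            img.crop((left, top, left + tile, top + tile)) \
               .save(dest / f'{fn.stem}_{top}_{left}.png')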

Progressive resizing wouldn’t be a bad option either. There is a chapter in the fastai book that discusses it.
