Advanced image data augmentation with albumentations & fastai

Just wanted to share my code for adding advanced data augmentation to the fastai training pipeline. I do wonder if there is a more elegant way to do it?

Libraries like albumentations or imgaug expect a PIL-like image array (or a cv2 image converted to RGB), but fastai augmentations work with tensors. So we have to convert the data back and forth, and there is also some wrapping required by fastai.

import albumentations as A
import numpy as np

from fastai.vision import *  # pil2tensor, TfmPixel

def tensor2np(x):
    # CHW float tensor in 0..1 -> HWC uint8 numpy array in 0..255
    np_image = x.cpu().permute(1, 2, 0).numpy()
    np_image = (np_image * 255).astype(np.uint8)

    return np_image

def alb_tfm2fastai(alb_tfm):
    def _alb_transformer(x):
        # tensor to numpy
        np_image = tensor2np(x)

        # apply albumentations
        transformed = alb_tfm(image=np_image)['image']

        # back to tensor
        tensor_image = pil2tensor(transformed, np.float32)
        tensor_image.div_(255)

        return tensor_image

    transformer = TfmPixel(_alb_transformer)
    
    return transformer()
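To make the round trip explicit (0…1 float CHW tensor → 0…255 uint8 HWC array → back), here is a minimal NumPy-only sketch; the random array stands in for the torch tensor fastai would pass in:

```python
import numpy as np

# Stand-in for the 0..1 float CHW tensor fastai passes in
# (3 channels, 4x4 "image"); real code would get a torch.Tensor.
chw = np.random.default_rng(0).random((3, 4, 4)).astype(np.float32)

# tensor -> numpy: CHW -> HWC, then scale to the 0..255 uint8
# range that albumentations/PIL-style code expects
hwc_uint8 = (chw.transpose(1, 2, 0) * 255).astype(np.uint8)

# ... the albumentations transform would run here on hwc_uint8 ...

# numpy -> tensor: HWC -> CHW, back to float32 in 0..1
chw_back = hwc_uint8.transpose(2, 0, 1).astype(np.float32) / 255

# the uint8 step quantizes values to 1/255 increments, so the
# round trip matches the original only up to that precision
assert np.abs(chw_back - chw).max() < 1 / 255
```

The small quantization loss from the uint8 step is usually harmless for augmentation, but it is worth knowing it is there.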

You can use this wrapper with any set of augmentations like this:

tfms = alb_tfm2fastai(A.HueSaturationValue())

And then pass it to get_transforms as xtra_tfms, or to transform() as the only set of transforms.

I hope someone will find it useful!


Great, Serhiy!! This is really helpful.

By the way, do you have an example for segmentation-type problems, where augmentation has to be applied to both x (image) and y (mask)?


That’s neat! The last time I used albumentations with fastai, I created a DataBunch using my custom Dataset class. Will play with this approach soon!
Thanks for sharing :blush:


@sayakgis The albumentations library does support this. In the code you will have to replace

alb_tfm(image=np_image)

with

alb_tfm(image=np_image, mask=???)

So the question is, how do you get the mask? I can see that fastai’s transform() method has a tfm_y argument. Maybe if you set it to True, the mask will be passed to your transform method. But I am not sure; I have never tried it.
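To illustrate why the mask has to go through the same call: when you pass both image and mask, albumentations applies identical spatial parameters to the pair, keeping them aligned. A pure-NumPy sketch of the idea, with a horizontal flip standing in for a random spatial transform (the toy arrays are made up for illustration):

```python
import numpy as np

# toy 4x4 single-channel image and a segmentation mask
# marking one object pixel
image = np.arange(16).reshape(4, 4)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1, 3] = 1  # object at row 1, col 3

# applying the SAME spatial transform to both keeps them aligned,
# which is what alb_tfm(image=..., mask=...) guarantees internally
image_t = np.fliplr(image)
mask_t = np.fliplr(mask)

# the mask still points at the same object pixel in the new frame:
# col 3 has moved to col 0 in both arrays
assert image_t[1, 0] == image[1, 3]
assert mask_t[1, 0] == 1
```

If the image and mask were flipped independently (e.g. by two separate random calls), the mask could end up on the wrong side, which is exactly what passing them in one call avoids.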

@serhiy
thank you for this…
if one were to apply a pipeline using Compose, how can one do it?

A.Compose([
        A.OneOf([
            A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.1,
                               rotate_limit=15,
                               border_mode=cv2.BORDER_CONSTANT, value=0),
            A.OpticalDistortion(distort_limit=0.11, shift_limit=0.15,
                                border_mode=cv2.BORDER_CONSTANT,
                                value=0),
            A.NoOp()
        ]),
        ZeroTopAndBottom(p=0.3)])  # ZeroTopAndBottom is a custom transform

@serhiy
how about the normalization part here… I see you are only multiplying the image by 255. I presume tfms will send the normalized image to alb_tfm2fastai?

@champs.jaideep yes, we need to scale the data to the 0…255 range before sending it to alb_tfm, and later in the code scale it back to 0…1.

@champs.jaideep regarding your question, you can use the pipeline for your augmentations like this:

tfms = alb_tfm2fastai(A.Compose([
    # all your augmentations here
]))

Before this, what I did was use the open method and then do interconversions between PIL and np.

Is that way better, or yours?