Adding a custom transform to the image data pipeline

I’d like to apply a custom transformation to image data and train a network to undo it. I plan to use a given image as its own output, and apply my transformation on the input side.

One ugly way to implement this would be to subclass ImageItemList for the input and override .get, applying my transformation to the image before returning it. I’d rather learn to do this kind of thing the right way.

It seems easy enough to make a function into a transform (mytfm = Transform(func)), but how does the transform then get into the queue? I tried calling get_transforms, appending my transform to each list, and passing those lists to an ImageDataBunch factory method (from_folder). However, .apply_tfms(mytfm) fails whenever I try to work with a datapoint, because it expects mytfm to have a .tfm attribute. This suggests im.apply_tfms expects a RandTransform or some similar wrapper as input (even though the argument tfms is typed as TfmList, which includes callables).

I also tried calling DataBunch.add_tfm with the same result – it wants a Transform wrapped so that it has a .tfm attribute.

The next step would be to wrap my function in a RandTransform, but this still seems wrong: my transformation is deterministic and not designed for data augmentation. It’s pre-processing for the input data.

Another fix would be to create a generic wrapper with the required attribute, but it seems strange that such a thing doesn’t already exist? Maybe I’m overlooking it, but I would have expected RandTransform to be a subclass of the generic wrapper class.

I’m also looking at all the .process methods in data_block, but I think these use instances of PreProcessor, which as I understand get applied at the ImageItemList.items level, which in my case is an array of Path objects - the Image object only gets loaded when used.

What is the usual way of working with a custom transform?


Just write your custom function, then wrap it in one of the TfmSomething classes. For instance, flip_lr is just:

def _flip_lr(x):
    "Flip `x` horizontally."
    return x.flip(2)
flip_lr = TfmPixel(_flip_lr)

It’s important to have two different names for the function and the transform (otherwise it won’t pickle correctly), but it’s very easy to add your own transform.
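To illustrate the naming point, here is a minimal sketch using only the standard library. Wrapper and the module name tfm_demo are hypothetical stand-ins (they are not fastai classes), and the in-memory module plays the role of a notebook or .py file. pickle stores a function by its module and name, so if that name has been rebound to the wrapper, the lookup no longer finds the function:

```python
import pickle
import sys
import types

# Hypothetical in-memory module standing in for a notebook or .py file.
mod = types.ModuleType("tfm_demo")
sys.modules["tfm_demo"] = mod

source = '''
class Wrapper:
    """Toy stand-in for a transform class; NOT the fastai implementation."""
    def __init__(self, func):
        self.func = func

def _flip_ok(x):
    return x[::-1]
flip_ok = Wrapper(_flip_ok)      # function and wrapper have different names

def flip_bad(x):
    return x[::-1]
flip_bad = Wrapper(flip_bad)     # the name now points at the wrapper
'''
exec(source, mod.__dict__)

# Works: pickle can still find the function as tfm_demo._flip_ok.
ok_bytes = pickle.dumps(mod.flip_ok)

# Fails: tfm_demo.flip_bad is the wrapper, not the wrapped function.
collision_error = None
try:
    pickle.dumps(mod.flip_bad)
except pickle.PicklingError as err:
    collision_error = err

print(collision_error)
```

This only shows the general pickle behaviour; I am assuming the same mechanism is what bites in fastai when the two names coincide.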


This is one of the things I tried. It fails when the data is accessed in any way with:

AttributeError: 'TfmPixel' object has no attribute 'tfm'

I’ve figured out that (with flip_lr defined as above, im an Image) instead of im.apply_tfms(flip_lr), flip_lr needs to be called with no arguments, like im.apply_tfms(flip_lr()) as calling flip_lr() is the signal to Transform to return itself wrapped in a RandTransform. I’m not familiar with this convention, where does it come from?
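For anyone else puzzled by the convention, here is a toy sketch of how I understand it. TransformSketch and RandTransformSketch are hypothetical names of my own and not the library's code; the only behaviour taken from the above is that calling the transform with arguments applies the function, while calling it with no arguments returns a wrapper that carries a .tfm attribute.

```python
class RandTransformSketch:
    """Toy wrapper: holds the transform plus the settings for applying it."""
    def __init__(self, tfm, kwargs, p=1.0):
        self.tfm = tfm        # the attribute whose absence caused the error above
        self.kwargs = kwargs
        self.p = p            # probability of applying the transform


class TransformSketch:
    """Toy transform: wraps a plain function."""
    def __init__(self, func):
        self.func = func

    def __call__(self, *args, p=1.0, **kwargs):
        if args:
            # Called with data: apply the function directly.
            return self.func(*args, **kwargs)
        # Called with no data: return a configured wrapper instead.
        return RandTransformSketch(self, kwargs, p=p)


def _negate(x):
    return -x

negate = TransformSketch(_negate)

direct = negate(3)    # applies the function
wrapped = negate()    # returns the wrapper that a transform list would hold

print(direct, type(wrapped).__name__, hasattr(negate, "tfm"))
```

So the bare transform has no .tfm, but the object returned by the empty call does, which matches the AttributeError I was seeing.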

However, if db is an ImageDataBunch, db.add_tfm(flip_lr()) is still not correct: attempts to use the dataloaders fail with something like AttributeError: 'tuple' object has no attribute 'something', where ‘something’ depends on which class I wrapped my function in, e.g. TfmPixel(_flip_lr) results in ‘pixel’, TfmCoord(_flip_lr) results in ‘coord’, etc.

But what does work is appending flip_lr() to the lists of transforms when you create the ImageDataBunch. E.g.:

_tfms = get_transforms()
tfms = [_tfms[0]+[flip_lr()], _tfms[1]+[flip_lr()] ]
db = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=sz)

So far at least, I get the behavior I’m expecting from the ImageDataBunch.


What is the easiest way to do ONLY one transform (for example, flip_lr and nothing else)? I’ve been using get_transforms() and trying to set all the parameters to prevent all but one, but it seems like there must be an easier way.

Just pass your custom lists (it needs to be a list of two lists of transforms, one for training and one for validation). [[flip_lr()],[]] should work AFAICT.


That worked, thanks. The part I was missing was the second list for validation.

I’ve been trying to make an augmentation function that adds text, but I’m running into what I think are pickling issues. The function:

# Assumes `from fastai.vision import *` has been run (np, PIL, image2np, pil2tensor)
import random
import string
from PIL import ImageDraw, ImageFont

def _textify(x):
    "Draw random text on `x` roughly 15% of the time."
    val = np.random.random_sample()

    if val > 0.85:
        # Convert the tensor to a PIL image so it can be drawn on
        pil_img = PIL.Image.fromarray(image2np(x*255).astype('uint8'))

        w, h = pil_img.size
        text_loc = (random.randint(0,w//2),random.randint(0,h//2))
        text_color = (random.randint(0,255), random.randint(0,255), random.randint(0,255))
        text_length = random.randint(0,100)
        text = ''.join([random.sample(string.printable, 1)[0]+random.sample(string.ascii_letters, 1)[0]
                                        for i in range(text_length)])
        text_size = random.randint(3,50)
        font = ImageFont.truetype("arial.ttf", text_size)
        ImageDraw.Draw(pil_img).text(text_loc, text, fill=text_color, font=font)

        # Convert back to a tensor
        x = pil2tensor(pil_img,np.float32)

    return x

Wrap in transform:

textify = TfmPixel(_textify)

Create transform list:

tfms = get_transforms()
new_tfms = (tfms[0] + [textify()], tfms[1])

This list of transforms applies to single images just fine. I can call


On any standard Image object and get the desired results.

When I try to put this into a dataloader:

new_data = (src.label_from_func(lambda x: path_hr/
   .transform(new_tfms, size=size, tfm_y=True)
   .databunch(bs=bs).normalize(imagenet_stats, do_y=True))

And grab a batch:

x, y = next(iter(new_data.train_dl))

The notebook hangs. In the terminal window, this error repeats:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\GATEWAY\Anaconda2\envs\fastai_v1\lib\multiprocessing\", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\GATEWAY\Anaconda2\envs\fastai_v1\lib\multiprocessing\", line 115, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute '_textify' on <module '__main__' (built-in)>

Does anyone know what I should do about this?

For multiprocessing on Windows, the functions are pickled to be sent to the different processes (that’s on the PyTorch side). You get an error when it tries to unpickle your new transform, which is weird.

  1. You can pass num_workers=0 to avoid the error entirely (it will disable multiprocessing)
  2. Maybe try to put your transform in a module?
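To show why option 2 might help, here is a small standard-library sketch. The module name my_tfms and the function _halve are hypothetical, and the in-memory module stands in for a real my_tfms.py file on the import path. pickle stores only the module and function names, so the receiving process has to be able to look the function up again; when it cannot, you get the same kind of AttributeError as in the traceback above. Whether this fixes the Windows case in this thread is untested.

```python
import pickle
import sys
import types

# Stand-in for a real file `my_tfms.py` (hypothetical name) that workers could import.
module = types.ModuleType("my_tfms")
exec("def _halve(x):\n    return x / 2\n", module.__dict__)
sys.modules["my_tfms"] = module

# The payload records where to find the function, not its code.
payload = pickle.dumps(module._halve)
print(b"my_tfms" in payload, b"_halve" in payload)

# A process that can import the module can rebuild the function.
restored = pickle.loads(payload)
restored_result = restored(8)

# Simulate a worker that cannot find the function under that name.
del module._halve
error_message = None
try:
    pickle.loads(payload)
except AttributeError as err:
    error_message = str(err)

print(error_message)
```

A function defined in a notebook lives in __main__, which a freshly spawned worker does not share, so moving it into an importable file gives the worker something to look up.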

After hours of struggle I managed to add a per image custom transformation, after finally finding this post. But what I really want to do is apply that transformation per mini-batch, as normalize seems to do. Is there any such place to break into the image pipeline?

I’d really appreciate a clear, complete, cookbook code example, if one is possible. To be simple and concrete,
_tfms = get_transforms(do_flip=True, flip_vert=True, max_zoom=0, p_affine=0, p_lighting=0)

Now alter _tfms such that each image pixel value is divided by 2.

Thanks for helping.

When you create a DataBunch, you can pass a set of ds_tfms (usually the result of get_transforms()) and dl_tfms, which will be applied to the dataloaders and allow you to make transformations on a mini-batch.

It must be a function that takes a batch xb,yb and returns the modified version. One example is the function batch_to_half that is used to put the input in FP16 for mixed-precision training.

You can also add/remove it from a specific dataloader with the functions add_tfm and remove_tfm.
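For the "divide every pixel by 2" request, a batch function could look roughly like the sketch below. The name halve_pixels is hypothetical, and I am assuming the function receives the batch as a single (xb, yb) pair, which is how batch_to_half appears to work. Plain numbers stand in for tensors here so the sketch runs without PyTorch; tensors support the same / operator.

```python
def halve_pixels(batch):
    """Divide every input value by 2 and leave the targets untouched."""
    xb, yb = batch          # assumption: the batch arrives as one (inputs, targets) pair
    return xb / 2, yb


# Stand-in batch: a float for the inputs, an int for the labels.
new_x, new_y = halve_pixels((1.0, 3))
print(new_x, new_y)
```

With a real DataBunch this would presumably be passed as dl_tfms=halve_pixels at creation time, or added afterwards with add_tfm as mentioned above; I have not run that part here.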


Thanks! That’s exactly what I needed.

Just to mention that I have exactly the same AttributeError as Karl with my custom transform wrapped in TfmPixel.