For a multi-label image classification problem, I was wondering whether colorful backgrounds were responsible for my poor results, so I wanted to convert all images to grayscale before training and classification. I've managed to get this working, but I'm having trouble figuring out whether there's a way to tell the DataBlock API the order in which I'd like the transforms to run.
```python
from fastai.vision.all import *
from PIL import Image
import numpy as np

class ToGrayscaleTransform(Transform):
    """fastai transforms are a piece of work.

    item_tfms decide whether to run based on the type,
    so if we didn't have that type annotation on x, it would
    get run on both the image and the category label (!).
    """
    def encodes(self, x: Image.Image):
        grey_img = x.convert("L")           # collapse to a single luminance channel
        a = np.asarray(grey_img)
        rgb = np.stack([a, a, a], axis=-1)  # replicate it back out to three channels
        print(f"shape of rgb data: {rgb.shape}")
        return Image.fromarray(rgb, mode="RGB")
```
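As a quick standalone sanity check of the channel-stacking trick (numpy only, no fastai or PIL; the array here just stands in for the single-channel "L"-mode image data):

```python
import numpy as np

# stand-in for the single-channel ("L" mode) grayscale data, shape (H, W)
a = np.arange(12, dtype=np.uint8).reshape(3, 4)

# stack the gray channel three times along a new last axis,
# giving an RGB-shaped array whose three channels are identical
rgb = np.stack([a, a, a], axis=-1)

print(rgb.shape)                                 # (3, 4, 3)
print(np.array_equal(rgb[..., 0], rgb[..., 2]))  # True
```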
```python
breeds = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(128), ToGrayscaleTransform()],
)
```
I would like (for efficiency) to first `Resize` down to thumbnail size and then convert to grayscale. However, whether I put `item_tfms=[Resize(128), ToGrayscaleTransform()]` or `item_tfms=[ToGrayscaleTransform(), Resize(128)]`, `summary()` and the learners always seem to run `ToGrayscaleTransform()` first and `Resize(128)` second:
```python
breeds.summary(path, bs=3, show_batch=False)
```

outputs:
```
Building one sample
  Pipeline: PILBase.create
    starting from
      images/dogs/pointer/00000104.jpg
    applying PILBase.create gives
      PILImage mode=RGB size=600x900
...
Building one batch
Applying item_tfms to the first sample:
  Pipeline: ToGrayscaleTransform -> Resize -- {'size': (128, 128), 'method': 'crop', 'pad_mode': 'reflection', 'resamples': (2, 0), 'p': 1.0} -> ToTensor
    starting from
      (PILImage mode=RGB size=600x900, TensorCategory(2))
shape of rgb data: (900, 600, 3)
    applying ToGrayscaleTransform gives
      (PILImage mode=RGB size=600x900, TensorCategory(2))
    applying Resize -- {'size': (128, 128), 'method': 'crop', 'pad_mode': 'reflection', 'resamples': (2, 0), 'p': 1.0} gives
      (PILImage mode=RGB size=128x128, TensorCategory(2))
    applying ToTensor gives
      (TensorImage of size 3x128x128, TensorCategory(2))
```
Is it intended that `item_tfms` can't convey ordering? I noticed that type dispatch means the transforms get routed separately to the input or to the labels, so maybe the list ordering gets lost somewhere in fastcore/fastai?
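For what it's worth, from skimming fastcore it looks to me like `Pipeline` sorts transforms by an `order` attribute rather than by list position (I believe `Resize` has `order = 1` and plain `Transform` defaults to `order = 0`, which would explain the behavior above). If that's right, setting `order = 2` on `ToGrayscaleTransform` should push it after `Resize`. A minimal plain-Python sketch of that sorting behavior (no fastai; the classes are just stand-ins I made up to illustrate):

```python
# stand-ins mimicking how fastcore's Pipeline (as I understand it)
# sequences transforms: by an `order` attribute, not by list position
class FakeResize:
    order = 1  # fastai's Resize reportedly has order 1

class FakeToGrayscale:
    order = 0  # Transform's default order is 0, so it runs before Resize

class FakeToGrayscaleFixed:
    order = 2  # bumping order past Resize should make it run after

def pipeline_order(tfms):
    # sort by the `order` attribute, like I believe Pipeline does
    return [type(t).__name__ for t in sorted(tfms, key=lambda t: t.order)]

# list position is ignored; `order` wins either way
print(pipeline_order([FakeResize(), FakeToGrayscale()]))
# -> ['FakeToGrayscale', 'FakeResize']
print(pipeline_order([FakeToGrayscaleFixed(), FakeResize()]))
# -> ['FakeResize', 'FakeToGrayscaleFixed']
```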