Fastai v2 vision

I'm addressing 1 and 2 directly in the notebook. For 3, yes, you would need to write a custom show method, or create more axes to fit them (look at the show_results method for examples).

@sgugger I just wanted to thank you for the Siamese tutorial :slight_smile: It's an excellent overview of how you can bring in any custom bit you want, and explains everything so clearly (to me). Thank you so much for doing that! :slight_smile:

(Yes, I finally just got around to it :wink: )

Just one question: if we want to augment the data, how would I go about augmenting only the first of the two images in our ImageTuple? Is there an easy way to do this? (i.e. any augmentation, such as Flip(), is applied only to the left image, while the right image stays the same.) I don't think it's doable with the type-delegation as is; a custom type would probably need to be introduced that patches the augmentations I'd like to use.

No real easy way for now. The only way I can think of would be to define a new type and patch the behavior onto the existing Transforms.
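For concreteness, here is a rough sketch of that idea (not code from the tutorial; PILImageRight, TensorImageRight and the choice of FlipItem are placeholders of mine, assuming the fastai2-era API): give the image that should stay fixed its own type, teach ToTensor to produce it, then add a no-op encodes for that type so type dispatch skips it.

from fastai2.vision.all import *

class PILImageRight(PILImage): pass        # hypothetical type for the image we want left untouched
class TensorImageRight(TensorImage): pass

# Teach ToTensor to turn the new PIL type into the new tensor type
@ToTensor
def encodes(self, o:PILImageRight): return TensorImageRight(image2tensor(o))

# Add a no-op encodes to an existing augmentation: type dispatch picks the most
# specific match, so TensorImageRight passes through while plain TensorImage
# (the left image) still gets flipped as usual.
@FlipItem
def encodes(self, x:TensorImageRight): return x

You'd repeat the no-op encodes for every augmentation you want restricted; batch transforms would need the same treatment on their own encodes.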

1 Like

Good to know I'm on the same page :wink: Thanks!

In the Siamese notebook, would it be possible to have a DataBlock with two ImageBlocks and then a CategoryBlock for the y? I assume you would have done it that way if you could, but I'm not sure what would prevent that from working.
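Roughly what I mean would be something like this (just a sketch; the getter functions are hypothetical and assume each item is a (left_path, right_path, label) tuple):

from fastai2.vision.all import *

# Hypothetical getters for illustration
def get_left(t):  return t[0]
def get_right(t): return t[1]
def get_label(t): return t[2]

dblock = DataBlock(blocks=(ImageBlock, ImageBlock, CategoryBlock),
                   n_inp=2,                     # the first two blocks are inputs, CategoryBlock is the target
                   get_x=[get_left, get_right],
                   get_y=get_label,
                   splitter=RandomSplitter(seed=42),
                   item_tfms=Resize(224))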

1 Like

I was also curious about the same.

@sgugger, a small correction in the siamese_tutorial notebook: open_image should use fname, which is currently hardcoded to files[0].

When I have time I'll show this working via patch :slight_smile:

1 Like

I had to do something similar for style transfer; take a look here and look for NormalizeX. I needed to apply the transform only to the input data.

The transform is only applied to TensorImageX, which in turn is created by PILImageX.
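The pattern looks roughly like this (a sketch of the idea, not the notebook's exact code; the stats handling is simplified):

from fastai2.vision.all import *

class PILImageX(PILImage): pass          # the input image gets its own PIL type...
class TensorImageX(TensorImage): pass    # ...and its own tensor type

# PILImageX becomes a TensorImageX when ToTensor runs
@ToTensor
def encodes(self, o:PILImageX): return TensorImageX(image2tensor(o))

class NormalizeX(Transform):
    "Normalize only the input: encodes is typed on TensorImageX, so everything else passes through"
    def __init__(self, mean, std): self.mean,self.std = tensor(mean)[...,None,None],tensor(std)[...,None,None]
    def encodes(self, x:TensorImageX): return (x - self.mean.to(x.device)) / self.std.to(x.device)
    def decodes(self, x:TensorImageX): return x * self.std.to(x.device) + self.mean.to(x.device)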

You could also do this, yes. I can add that as an example.

2 Likes
@typedispatch
def show_batch(x:ImageTuple, y, samples, ctxs=None, max_n=6, rows=None, cols=2, figsize=None, **kwargs):
    if figsize is None: figsize = (cols*6, max_n//cols * 3)
    if ctxs is None: ctxs = get_grid(min(len(samples), max_n), rows=rows, cols=cols, figsize=figsize)
    ctxs = show_batch[object](x, y, samples, ctxs=ctxs, max_n=max_n, **kwargs)
    return ctxs

Can someone explain the meaning of show_batch[object](...) in this method? I can see the method is calling itself, but what's the use of [object]?

show_batch[object] is the default implementation of show_batch (which can be found in data.core).
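As a toy illustration of what indexing a typedispatched function with a type does (my own example, not fastai code):

from fastcore.dispatch import typedispatch

@typedispatch
def greet(x:object): return "object version"   # generic fallback

@typedispatch
def greet(x:int): return "int version"         # more specific implementation

greet(1)           # dispatches on the argument type -> "int version"
greet[object](1)   # grabs the implementation registered for object -> "object version"

So in the snippet above, show_batch[object](...) explicitly calls the generic show_batch on the samples, after the ImageTuple-specific version has prepared the grid of axes.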

2 Likes

Hello all!

I am trying to use Datasets / DataLoaders in vision, so that later I can combine it with tabular data and create a mixed model.

I am struggling to convert this:

tfms = partial(aug_transforms, 
               max_rotate=15, 
               max_zoom=1.1,
               max_lighting=0.4, 
               max_warp=0.2,
               p_affine=1., 
               p_lighting=1.)

dls = ImageDataLoaders.from_folder(path, 
                                   valid_pct=0.2, 
                                   seed=42, 
                                   item_tfms=RandomResizedCrop(460, min_scale=0.75), 
                                   batch_tfms=[*tfms(size=size, pad_mode=pad_mode, batch=batch), Normalize.from_stats(*imagenet_stats)],
                                   bs=bs, shuffle_train=True)

to this:

def get_x(x): return x
def get_y(x): return parent_label(x)

tfms = [[get_x, PILImage.create], 
        [get_y, Categorize()]]

dsets = Datasets(imgs, tfms, splits=RandomSplitter(seed=42)(imgs))
dls = dsets.dataloaders(bs=8, source=imgs, #num_workers=8, 
                        after_item = [RandomResizedCrop(460, min_scale=0.75), ToTensor()],
                        after_batch=[IntToFloatTensor(), 
                                     Resize(size=448),
                                     Rotate(max_deg=15, p=1., pad_mode='reflection', batch=False), 
                                     Zoom(max_zoom=1.1, p=1., pad_mode='reflection', batch=False), 
                                     Warp(magnitude=0.2, p=1., pad_mode='reflection', batch=False), 
                                     Brightness(max_lighting=0.4, p=1., batch=False), 
                                     Normalize.from_stats(*imagenet_stats)],
                        shuffle_train=True
                       )

In fact, the dataloader is created and I can show a batch and even train a model. But the results after training are much worse in the second case. I suspect it has something to do with validation set augmentation, but I am not sure. Any thoughts, @muellerzr ?

Note that any GPU transform makes the following assumption:

  • Everything is in a batch, so everything must already be the same size

So if all your input images are already the same size, you can use RandomResizedCropGPU directly as a batch transform. If they're not, we could do something like this:

item_tfms = [RandomResizedCrop(sz1)]       # CPU, per item: makes every image the same size
batch_tfms = [RandomResizedCropGPU(sz2)]   # GPU, per batch: crops the now uniformly-sized batch

2 Likes

Thanks! I was aware of that but somehow believed it should work :sweat_smile:

Should passing a TensorImage through ConvLayer retain the type or not?

from fastai2.vision.all import *

im_path = '/content/data/imagewoof2-320/train/n02086240/ILSVRC2012_val_00000907.JPEG'
pipe = Pipeline([PILImage.create, Resize(224), ToTensor, IntToFloatTensor])
timg = pipe(im_path)
conv_layer = ConvLayer(3, 32)
conv_img = conv_layer(timg.unsqueeze(dim=0))

I was experimenting with the same thing and the above code didn't preserve the types. The output is a plain tensor object. I also checked the first element of a batch, but that doesn't have the original type either.

No, Sylvain discussed this, but PyTorch doesn't support it. So once it's at the model level, assume it becomes a raw tensor.
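If the semantic type matters downstream, you can always recast the output yourself, e.g. (a sketch; timg here is just a stand-in for the pipeline output above):

import torch
from fastai2.vision.all import *

timg = TensorImage(torch.rand(3, 224, 224))   # stand-in for the image produced by the pipeline above
conv_layer = ConvLayer(3, 32)
out = conv_layer(timg.unsqueeze(0))           # typically comes back as a plain torch.Tensor
out = TensorImage(out)                        # recast manually if you need the subclass back
# fastcore's retain_type(out, typ=TensorImage) can be used for the same purpose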

3 Likes

Okay, thanks!

I want to customize the text_size and color of LabeledBBox. I found that eventually we end up calling the show method of TensorBBox, so I tried the following:

  1. Redefining the show method and using default args to deal with it:
@TensorBBox
def show(self,ctx=None,color='white',text_size=14, **kwargs):
  x = self.view(-1,4)
  for b in x: _draw_rect(ctx, b, hw=False, color=color,text_size=text_size, **kwargs)
  return ctx
  2. Passing it in as dls_kwargs:
BBoxBlock = TransformBlock(type_tfms=TensorBBox.create, item_tfms=PointScaler, dls_kwargs = {'before_batch': bb_pad, 'text_size': 12})
  3. Adding an item_tfms (inspired by AddMaskCodes):
class SetAttributes(Transform):
  def __init__(self, tsize=14, color='white'):
    store_attr(self,'tsize,color')
  def decodes(self, o:TensorBBox):
    o._meta = {'text_size': self.tsize, 'color': self.color}
    return o

And then redefining TensorBBox to work with _meta.

Nothing worked. Please let me know how it's done.

Point 1 is not supposed to do anything. This syntax only works when you want to add a new encodes or decodes to a transform; it's not Python syntax otherwise.
If you want a new show method, you should subclass TensorBBox and change the show method.
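For instance, something like this (a hedged sketch; MyBBox and MyBBoxBlock are made-up names, and _draw_rect is the private helper that TensorBBox.show uses):

from fastai2.vision.all import *
from fastai2.vision.core import _draw_rect

class MyBBox(TensorBBox):
    "TensorBBox with a customized label color and text size when shown"
    def show(self, ctx=None, color='red', text_size=12, **kwargs):
        x = self.view(-1, 4)
        for b in x: _draw_rect(ctx, b, hw=False, color=color, text_size=text_size, **kwargs)
        return ctx

# and point the block at the subclass instead of TensorBBox
MyBBoxBlock = TransformBlock(type_tfms=MyBBox.create, item_tfms=PointScaler,
                             dls_kwargs={'before_batch': bb_pad})

Since MyBBox subclasses the real TensorBBox, anything type-dispatched on TensorBBox should still match it.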

So you're suggesting that subclassing TensorBBox is the only solution? I see the _draw_rect method has the parameters I want to customize, so can I do it with some kind of transform that makes sure those parameters get passed to _draw_rect?

Also, won't it mess up all the methods dispatched for TensorBBox since the type will be changed?

Continuing with the 3rd point, I'm using the same SetAttributes defined here.

I passed it to BBoxBlock:

BBoxBlock = TransformBlock(type_tfms=[TensorBBox.create], item_tfms=[PointScaler,SetAttributes(tsize=12,color='red')], dls_kwargs = {'before_batch': bb_pad})

And used it in the redefined TensorBBox:

class TensorBBox(TensorPoint):
    "Basic type for a tensor of bounding boxes in an image"
    @classmethod
    def create(cls, x, img_size=None)->None: return cls(tensor(x).view(-1, 4).float(), img_size=img_size)

    def show(self, ctx=None, **kwargs):
        x = self.view(-1,4)
        tsize,color = self.get_meta('tsize'),self.get_meta('color')
        for b in x: _draw_rect(ctx, b, hw=False,color=color,text_size=tsize, **kwargs)
        return ctx

But it failed. I tried to debug it using dblock.summary(). Here are the error logs:

Building one batch
Applying item_tfms to the first sample:
  Pipeline: BBoxLabeler -> PointScaler -> Resize -> SetAttributes -> ToTensor
    starting from
      (PILImage mode=RGB size=500x375, TensorBBox of size 1x4, TensorMultiCategory([12]))
    applying BBoxLabeler gives
      (PILImage mode=RGB size=500x375, TensorBBox of size 1x4, TensorMultiCategory([12]))
    applying PointScaler failed.

---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

<ipython-input-53-29e18c57cddf> in <module>()
----> 1 pascal.summary(path/'train')

11 frames

/usr/local/lib/python3.6/dist-packages/fastai2/data/block.py in summary(self, source, bs, **kwargs)
    152     if len([f for f in dls.train.after_item.fs if f.name != 'noop'])!=0:
    153         print("Applying item_tfms to the first sample:")
--> 154         s = [_apply_pipeline(dls.train.after_item, dsets.train[0])]
    155         print(f"\nAdding the next {bs-1} samples")
    156         s += [dls.train.after_item(dsets.train[i]) for i in range(1, bs)]

/usr/local/lib/python3.6/dist-packages/fastai2/data/block.py in _apply_pipeline(p, x)
    122         except Exception as e:
    123             print(f"    applying {name} failed.")
--> 124             raise e
    125     return x
    126 

/usr/local/lib/python3.6/dist-packages/fastai2/data/block.py in _apply_pipeline(p, x)
    118         name = f.name
    119         try:
--> 120             x = f(x)
    121             if name != "noop": print(f"    applying {name} gives\n      {_short_repr(x)}")
    122         except Exception as e:

/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in __call__(self, x, **kwargs)
     70     @property
     71     def name(self): return getattr(self, '_name', _get_name(self))
---> 72     def __call__(self, x, **kwargs): return self._call('encodes', x, **kwargs)
     73     def decode  (self, x, **kwargs): return self._call('decodes', x, **kwargs)
     74     def __repr__(self): return f'{self.name}: {self.encodes} {self.decodes}'

/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in _call(self, fn, x, split_idx, **kwargs)
     80     def _call(self, fn, x, split_idx=None, **kwargs):
     81         if split_idx!=self.split_idx and self.split_idx is not None: return x
---> 82         return self._do_call(getattr(self, fn), x, **kwargs)
     83 
     84     def _do_call(self, f, x, **kwargs):

/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in _do_call(self, f, x, **kwargs)
     85         if not _is_tuple(x):
     86             return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
---> 87         res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
     88         return retain_type(res, x)
     89 

/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in <genexpr>(.0)
     85         if not _is_tuple(x):
     86             return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
---> 87         res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
     88         return retain_type(res, x)
     89 

/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in _do_call(self, f, x, **kwargs)
     84     def _do_call(self, f, x, **kwargs):
     85         if not _is_tuple(x):
---> 86             return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
     87         res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
     88         return retain_type(res, x)

/usr/local/lib/python3.6/dist-packages/fastcore/dispatch.py in __call__(self, *args, **kwargs)
     96         if not f: return args[0]
     97         if self.inst is not None: f = MethodType(f, self.inst)
---> 98         return f(*args, **kwargs)
     99 
    100     def __get__(self, inst, owner):

/usr/local/lib/python3.6/dist-packages/fastai2/vision/core.py in encodes(self, x)
    242     def decodes(self, x:(PILBase,TensorImageBase)): return self._grab_sz(x)
    243 
--> 244     def encodes(self, x:TensorPoint): return _scale_pnts(x, self._get_sz(x), self.do_scale, self.y_first)
    245     def decodes(self, x:TensorPoint): return _unscale_pnts(x.view(-1, 2), self._get_sz(x))
    246 

/usr/local/lib/python3.6/dist-packages/fastai2/vision/core.py in _scale_pnts(y, sz, do_scale, y_first)
    215 def _scale_pnts(y, sz, do_scale=True, y_first=False):
    216     if y_first: y = y.flip(1)
--> 217     res = y * 2/tensor(sz).float() - 1 if do_scale else y
    218     return TensorPoint(res, img_size=sz)
    219 

/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in _f(self, *args, **kwargs)
    270         def _f(self, *args, **kwargs):
    271             cls = self.__class__
--> 272             res = getattr(super(TensorBase, self), fn)(*args, **kwargs)
    273             return retain_type(res, self)
    274         return _f

RuntimeError: The size of tensor a (4) must match the size of tensor b (2) at non-singleton dimension 1

Trying to figure it out based on those tensor sizes, I found these methods which might be failing.