Fastai v2 vision

Thank you @sgugger for the siamese tutorial notebook !

It’s still a work in progress, so please tell me if there are things there that don’t seem clear.

2 Likes

When implementing show_batch for an x:ImageTuple

def show_batch(x:ImageTuple, y, samples, ctxs=None, max_n=6, rows=None, cols=2, figsize=None, **kwargs):

you are saying:

Here we only dispatch on the x, but we could have custom behaviors depending on the targets.

1°. What do you mean by “only dispatch on the x”?

2°. The x and y in:

ctxs = show_batch[object](x, y, samples, ctxs=ctxs, max_n=max_n, **kwargs)

seem to be useless and can be replaced by None, None. What could we use them for since, as you said, the actual samples are in the samples variable?

3°. If instead of the CategoryBlock I use a different target, like a MaskBlock (along with the ImageTupleBlock), should I change the show method of the MaskBlock so the mask is drawn to the right of the ImageTuple (i.e. three images in a row), or is it better to write custom plotting/drawing of the samples instead of calling show_batch[object]?

Thanks!

1 Like

Addressing 1 and 2 directly in the notebook. For 3, yes you would need to write a custom show method, or create more axes to fit them (look at the show_results method for examples).
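For reference, a custom show_batch for an ImageTuple input with a mask target could look roughly like the following. This is an untested sketch: it assumes the decoded samples are (ImageTuple, TensorMask) pairs, reuses the get_grid call from the tutorial, and simply lays out three axes per row (left image, right image, mask).

@typedispatch
def show_batch(x:ImageTuple, y:TensorMask, samples, ctxs=None, max_n=6, figsize=None, **kwargs):
    n = min(len(samples), max_n)
    if figsize is None: figsize = (3*3, n*3)
    if ctxs is None: ctxs = get_grid(3*n, rows=n, cols=3, figsize=figsize)
    for i, (imgs, msk) in enumerate(samples[:n]):
        imgs[0].show(ctx=ctxs[3*i])      # first image of the tuple
        imgs[1].show(ctx=ctxs[3*i+1])    # second image of the tuple
        msk.show(ctx=ctxs[3*i+2])        # the mask, drawn to the right
    return ctxs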

@sgugger I just wanted to thank you for the Siamese tutorial :slight_smile: It’s an excellent overview of how you can bring in any custom bit you want, and explains everything so clearly (to me). Thank you so much for doing that! :slight_smile:

(Yes I finally just got around to getting to it :wink: )

Only one question: if we want to augment the data, how would I go about augmenting only the first of the two images in our ImageTuple? Is there an easy way to do this? (i.e. any augmentation, such as Flip(), is applied only to the left image while the right image stays the same.) I don’t think it would be doable due to the type-delegation; a custom type would probably need to be introduced that patches the augmentations I’d like to use.

No real easy way for now. The only way I can think of would be to define a new type and patch the behavior onto the existing Transforms.

1 Like

Good to know I’m on the same page :wink: Thanks!

In the siamese notebook would it be possible to have a DataBlock with two ImageBlocks and then a Categorical block for the y? I assume you would have done it that way if you could, but not sure what would prevent that from working.

1 Like
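For reference, the kind of DataBlock being described would look roughly like this. It is an untested sketch: the two getter functions are hypothetical placeholders for however you extract the two images from each item, and n_inp=2 marks the first two blocks as the inputs.

dblock = DataBlock(blocks=(ImageBlock, ImageBlock, CategoryBlock),
                   n_inp=2,                                    # first two blocks are the inputs
                   get_items=get_image_files,
                   get_x=[get_first_image, get_second_image],  # hypothetical getters
                   get_y=parent_label,
                   splitter=RandomSplitter(seed=42))
dls = dblock.dataloaders(path)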

I was also curious about the same.

@sgugger Small correction in the siamese_tutorial notebook: open_image should use fname, which is currently hardcoded to files[0].

When I have time I’ll show this working via patch :slight_smile:

1 Like

I had to do something similar for style transfer; take a look here and look for NormalizeX. I needed to apply the transform only to the input data.

The transform is only applied to TensorImageX, which in turn is created by PILImageX.
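To make that pattern concrete, here is a minimal sketch of the idea (the names are illustrative, and it assumes the _tensor_cls hook that ToTensor uses to pick the tensor type): give the input image its own type, then write the transform so its encodes only dispatches on that type, leaving everything else untouched.

class PILImageX(PILImage): pass
class TensorImageX(TensorImage): pass
PILImageX._tensor_cls = TensorImageX   # ToTensor on a PILImageX then yields a TensorImageX

class NormalizeX(Transform):
    "Normalize only the inputs (TensorImageX), leaving the targets untouched"
    def __init__(self, mean, std): self.mean,self.std = mean,std
    def encodes(self, x:TensorImageX): return (x - self.mean) / self.std
    def decodes(self, x:TensorImageX): return x * self.std + self.mean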

You could also do this, yes. I can add that as an example.

2 Likes

@typedispatch
def show_batch(x:ImageTuple, y, samples, ctxs=None, max_n=6, rows=None, cols=2, figsize=None, **kwargs):
    if figsize is None: figsize = (cols*6, max_n//cols * 3)
    if ctxs is None: ctxs = get_grid(min(len(samples), max_n), rows=rows, cols=cols, figsize=figsize)
    ctxs = show_batch[object](x, y, samples, ctxs=ctxs, max_n=max_n, **kwargs)
    return ctxs

Can someone explain the meaning of show_batch[object](...) in this method? I can see the method is calling itself, but what’s the use of [object]?

show_batch[object] is the default implementation of show_batch (that can be found in data.core).
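In other words, indexing a typedispatch function with a type retrieves the implementation registered for that type, so show_batch[object](...) calls the generic fallback instead of recursing into the ImageTuple version. A tiny illustration of the mechanism, with a hypothetical function f:

from fastcore.dispatch import typedispatch

@typedispatch
def f(x:int): return "int-specific version"

@typedispatch
def f(x:object): return "generic fallback"

f(1)            # 'int-specific version' (dispatch on the argument's type)
f[object](1)    # 'generic fallback' (explicitly pick the implementation for object)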

2 Likes

Hello all!

I am trying to use Datasets / DataLoaders in vision, so that later I can combine it with tabular data and create a mixed model.

I am struggling to convert this:

tfms = partial(aug_transforms, 
               max_rotate=15, 
               max_zoom=1.1,
               max_lighting=0.4, 
               max_warp=0.2,
               p_affine=1., 
               p_lighting=1.)

dls = ImageDataLoaders.from_folder(path, 
                                   valid_pct=0.2, 
                                   seed=42, 
                                   item_tfms=RandomResizedCrop(460, min_scale=0.75), 
                                   batch_tfms=[*tfms(size=size, pad_mode=pad_mode, batch=batch), Normalize.from_stats(*imagenet_stats)],
                                   bs=bs, shuffle_train=True)

to this:

def get_x(x): return x
def get_y(x): return parent_label(x)

tfms = [[get_x, PILImage.create], 
        [get_y, Categorize()]]

dsets = Datasets(imgs, tfms, splits=RandomSplitter(seed=42)(imgs))
dls = dsets.dataloaders(bs=8, source=imgs, #num_workers=8, 
                        after_item = [RandomResizedCrop(460, min_scale=0.75), ToTensor()],
                        after_batch=[IntToFloatTensor(), 
                                     Resize(size=448),
                                     Rotate(max_deg=15, p=1., pad_mode='reflection', batch=False), 
                                     Zoom(max_zoom=1.1, p=1., pad_mode='reflection', batch=False), 
                                     Warp(magnitude=0.2, p=1., pad_mode='reflection', batch=False), 
                                     Brightness(max_lighting=0.4, p=1., batch=False), 
                                     Normalize.from_stats(*imagenet_stats)],
                        shuffle_train=True
                       )

In fact, the dataloader is created and I can show a batch and even train a model. But the results after training are much worse in the second case. I suspect it has something to do with validation set augmentation, but I am not sure. Any thoughts, @muellerzr ?

Note, any GPU transform makes the following assumption:

  • Everything is collated into a batch, so every item must already be the same size

So if all your input images are already the same size, you can use RandomResizedCropGPU directly as a batch transform. If they aren’t, you could first crop to a common size at the item level and then crop again on the GPU, something like:

item_tfms = [RandomResizedCrop(sz1)]
batch_tfms = [RandomResizedCropGPU(sz2)]
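To tie this back to the original question, one way to keep the two setups closer is to crop to a common size at the item level and then let aug_transforms build the GPU-side augmentations, as ImageDataLoaders.from_folder did. An untested sketch, reusing imgs and the item/label tfms list from the question above:

dsets = Datasets(imgs, tfms, splits=RandomSplitter(seed=42)(imgs))
dls = dsets.dataloaders(bs=8,
                        after_item=[RandomResizedCrop(460, min_scale=0.75), ToTensor()],
                        after_batch=[IntToFloatTensor(),
                                     *aug_transforms(size=224,  # pick the same size as the from_folder version
                                                     max_rotate=15, max_zoom=1.1, max_lighting=0.4,
                                                     max_warp=0.2, p_affine=1., p_lighting=1.),
                                     Normalize.from_stats(*imagenet_stats)],
                        shuffle_train=True)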
2 Likes

Thanks! I was aware of that but somehow believed it should work :sweat_smile:

Should passing a TensorImage through a ConvLayer retain the type or not?

im_path = '/content/data/imagewoof2-320/train/n02086240/ILSVRC2012_val_00000907.JPEG'
pipe = Pipeline([PILImage.create, Resize(224), ToTensor, IntToFloatTensor])
timg = pipe(im_path)
conv_layer = ConvLayer(3,32)
conv_img = conv_layer(timg.unsqueeze(0))

I was experimenting with the same, and the above code didn’t preserve the types: the output is a plain tensor object. I also checked the first element of a batch, but that doesn’t have the original types either.

No, Sylvain discussed this, but PyTorch doesn’t support it. So once it’s at the model level, assume it gets a raw tensor.
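If you do want the fastai type back after a plain PyTorch layer, you can re-cast the output yourself, for example (a small sketch continuing the snippet above):

conv_img = conv_layer(timg.unsqueeze(0))
conv_img = TensorImage(conv_img)      # re-apply the fastai type by casting
# or, equivalently: cast(conv_img, TensorImage) from fastcore.dispatch
type(conv_img)                        # TensorImage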

3 Likes

Okay, thanks!