Addressing 1 and 2 directly in the notebook. For 3, yes you would need to write a custom show method, or create more axes to fit them (look at the show_results method for examples).
@sgugger I just wanted to thank you for the Siamese tutorial! It's an excellent overview of how you can bring in any custom bit you want, and it explains everything so clearly (to me). Thank you so much for doing that!
(Yes, I finally just got around to getting to it!)
Only one question: if we're wanting to augment the data, how would I go about augmenting only the first of the two images in our ImageTuple? Is there an easy way to do this? (I.e., any augmentation such as Flip() is done only on the left image, and the right image stays the same.) I don't think it would be doable due to the type delegation; a custom one would probably need to be introduced that patches the augmentations I'd like to use.
No real easy way for now. The only way I can think of would be to define a new type and patch the behavior onto an existing Transform.
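For concreteness, here is a minimal sketch of that idea (untested; the names TensorImageLeft and FlipLeft are my own, not from the tutorial): give the left image its own tensor type, then register an encodes only for that type, so that when the transform runs over the tuple the right image passes through unchanged.

from fastai2.vision.all import *

# Hypothetical subtype so type dispatch can target the left image only
class TensorImageLeft(TensorImage): pass

class FlipLeft(Transform):
    "Flip only elements typed TensorImageLeft; anything else passes through untouched"
    def encodes(self, x:TensorImageLeft): return x.flip(-1)

Since a Transform class can also be used as a decorator to register a new encodes (the same syntax discussed further down this thread), you could patch an existing augmentation in the same way.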
Good to know I'm on the same page. Thanks!
In the siamese notebook would it be possible to have a DataBlock with two ImageBlocks and then a Categorical block for the y? I assume you would have done it that way if you could, but not sure what would prevent that from working.
I was also curious about the same.
@sgugger small correction in the siamese_tutorial notebook: open_image should use fname, which is currently hardcoded to files[0].
When I have time I'll show this working via patch.
I had to do something similar for style transfer; take a look here and look for NormalizeX. I needed to apply the transform only on the input data. The transform is only applied to TensorImageX, which in turn is created by PILImageX.
You could also do this, yes. I can add that as an example.
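For reference, a hedged sketch of what that could look like (untested; it assumes each item is a (file1, file2, label) tuple, which may differ from the tutorial's setup):

from fastai2.vision.all import *

dblock = DataBlock(
    blocks=(ImageBlock, ImageBlock, CategoryBlock),
    get_x=[ItemGetter(0), ItemGetter(1)],  # one getter per input block
    get_y=ItemGetter(2),                   # e.g. a "same"/"different" label
    n_inp=2,                               # the first two blocks are the inputs
    item_tfms=Resize(224))

As noted above, you'd still need custom show methods (like the show_batch below) to display the image pair nicely.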
@typedispatch
def show_batch(x:ImageTuple, y, samples, ctxs=None, max_n=6, rows=None, cols=2, figsize=None, **kwargs):
    if figsize is None: figsize = (cols*6, max_n//cols * 3)
    if ctxs is None: ctxs = get_grid(min(len(samples), max_n), rows=rows, cols=cols, figsize=figsize)
    ctxs = show_batch[object](x, y, samples, ctxs=ctxs, max_n=max_n, **kwargs)
    return ctxs
Can someone explain the meaning of show_batch[object](...) in this method? I can see the method is calling itself, but what's the use of [object]?
show_batch[object] is the default implementation of show_batch (it can be found in data.core).
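For intuition, a tiny standalone illustration of how fastcore's TypeDispatch indexing works (f here is just a toy function):

from fastcore.dispatch import typedispatch

@typedispatch
def f(x:int): return "int version"

@typedispatch
def f(x:object): return "default version"

f(3)       # 'int version' -- calling dispatches on the argument's type
f[object]  # looks up the implementation registered for object, i.e. the default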
Hello all!
I am trying to use Datasets/DataLoaders in vision, so that later I can combine it with tabular data and create a mixed model.
I am struggling to convert this:
tfms = partial(aug_transforms,
               max_rotate=15,
               max_zoom=1.1,
               max_lighting=0.4,
               max_warp=0.2,
               p_affine=1.,
               p_lighting=1.)

dls = ImageDataLoaders.from_folder(path,
        valid_pct=0.2,
        seed=42,
        item_tfms=RandomResizedCrop(460, min_scale=0.75),
        batch_tfms=[*tfms(size=size, pad_mode=pad_mode, batch=batch), Normalize.from_stats(*imagenet_stats)],
        bs=bs, shuffle_train=True)
to this:
def get_x(x): return x
def get_y(x): return parent_label(x)

tfms = [[get_x, PILImage.create],
        [get_y, Categorize()]]

dsets = Datasets(imgs, tfms, splits=RandomSplitter(seed=42)(imgs))

dls = dsets.dataloaders(bs=8, source=imgs, #num_workers=8,
        after_item=[RandomResizedCrop(460, min_scale=0.75), ToTensor()],
        after_batch=[IntToFloatTensor(),
                     Resize(size=448),
                     Rotate(max_deg=15, p=1., pad_mode='reflection', batch=False),
                     Zoom(max_zoom=1.1, p=1., pad_mode='reflection', batch=False),
                     Warp(magnitude=0.2, p=1., pad_mode='reflection', batch=False),
                     Brightness(max_lighting=0.4, p=1., batch=False),
                     Normalize.from_stats(*imagenet_stats)],
        shuffle_train=True)
In fact, the dataloader is created and I can show a batch and even train a model, but the results after training are much worse in the second case. I suspect it has something to do with validation set augmentation, but I am not sure. Any thoughts, @muellerzr?
Note, any GPU transform makes the following assumption:
- Everything must be in a batch, so everything must already be the same size
So if all your input images are the same size, you can use RandomResizedCropGPU directly. If not, we could do something like so:
item_tfms  = [RandomResizedCrop(sz1)]
batch_tfms = [RandomResizedCropGPU(sz2)]
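For reference, a hedged sketch (untested, with placeholder sizes) of how that two-stage crop could slot into the Datasets pipeline above, leaning on aug_transforms for the batch augmentations:

dls = dsets.dataloaders(bs=8,
        after_item=[RandomResizedCrop(460, min_scale=0.75), ToTensor()],
        after_batch=[IntToFloatTensor(), RandomResizedCropGPU(224),
                     *aug_transforms(max_rotate=15, max_zoom=1.1, max_lighting=0.4,
                                     max_warp=0.2, p_affine=1., p_lighting=1.),
                     Normalize.from_stats(*imagenet_stats)])

Note that the transforms returned by aug_transforms only run on the training split, so the validation set is never augmented.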
Thanks! I was aware of that, but somehow believed it should work.
Should passing a TensorImage through ConvLayer retain the type or not?
im_path = '/content/data/imagewoof2-320/train/n02086240/ILSVRC2012_val_00000907.JPEG'
pipe = Pipeline([PILImage.create, Resize(224), ToTensor, IntToFloatTensor])
timg = pipe(im_path)
conv_layer = ConvLayer(3,32)
conv_img = conv_layer(unsqueeze(timg,dim=0))
I was experimenting with the same, and the above code didn't preserve the types. The output is a plain tensor object. I also checked the first element of the batch, but that too doesn't have the original types.
No, Sylvain discussed this, but PyTorch doesn't support it. So once it's at the model level, assume it gets a raw tensor.
Okay, thanks!
I want to customize the text_size and color of LabeledBBox. I found that eventually we end up calling the show method of TensorBBox, so I tried doing:
- Redefine the show method and use default args to deal with it:
@TensorBBox
def show(self, ctx=None, color='white', text_size=14, **kwargs):
    x = self.view(-1,4)
    for b in x: _draw_rect(ctx, b, hw=False, color=color, text_size=text_size, **kwargs)
    return ctx
- Passing it in as dls_kwargs:
BBoxBlock = TransformBlock(type_tfms=TensorBBox.create, item_tfms=PointScaler, dls_kwargs = {'before_batch': bb_pad, 'text_size': 12})
- Add item_tfms (inspired by AddMaskCodes):
class SetAttributes(Transform):
    def __init__(self, tsize=14, color='white'):
        store_attr(self, 'tsize,color')
    def decodes(self, o:TensorBBox):
        o._meta = {'text_size': self.tsize, 'color': self.color}
        return o
And then redefining TensorBBox to work with _meta. Nothing worked. Please let me know how it's done.
Point 1 is not supposed to do anything. This syntax only works when you want to add a new encodes or decodes to a transform; it's not Python syntax otherwise.
If you want a new show method, you should subclass TensorBBox and change the show method.
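Something along these lines (a hedged sketch, untested; the subclass name is made up, and _draw_rect is the internal helper the existing show already uses):

class MyTensorBBox(TensorBBox):
    def show(self, ctx=None, color='red', text_size=12, **kwargs):
        x = self.view(-1, 4)
        for b in x: _draw_rect(ctx, b, hw=False, color=color, text_size=text_size, **kwargs)
        return ctx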
So you're suggesting that subclassing TensorBBox is the only solution? I see the _draw_rect method has those parameters which I want to customize, so can I do it using some kind of transform which will make sure to pass those parameters to _draw_rect? Also, won't it mess up all the methods dispatched for TensorBBox, since the type will be changed?
Continuing to the 3rd point, I'm using the same SetAttributes defined here. I passed it to BBoxBlock:
BBoxBlock = TransformBlock(type_tfms=[TensorBBox.create], item_tfms=[PointScaler,SetAttributes(tsize=12,color='red')], dls_kwargs = {'before_batch': bb_pad})
And used it in the redefined TensorBBox:
class TensorBBox(TensorPoint):
    "Basic type for a tensor of bounding boxes in an image"
    @classmethod
    def create(cls, x, img_size=None)->None: return cls(tensor(x).view(-1, 4).float(), img_size=img_size)
    def show(self, ctx=None, **kwargs):
        x = self.view(-1,4)
        tsize,color = self.get_meta('tsize'),self.get_meta('color')
        for b in x: _draw_rect(ctx, b, hw=False, color=color, text_size=tsize, **kwargs)
        return ctx
But it failed. I tried to debug it using dblock.summary(). Here are the error logs:
Building one batch
Applying item_tfms to the first sample:
Pipeline: BBoxLabeler -> PointScaler -> Resize -> SetAttributes -> ToTensor
starting from
(PILImage mode=RGB size=500x375, TensorBBox of size 1x4, TensorMultiCategory([12]))
applying BBoxLabeler gives
(PILImage mode=RGB size=500x375, TensorBBox of size 1x4, TensorMultiCategory([12]))
applying PointScaler failed.
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-53-29e18c57cddf> in <module>()
----> 1 pascal.summary(path/'train')
11 frames
/usr/local/lib/python3.6/dist-packages/fastai2/data/block.py in summary(self, source, bs, **kwargs)
152 if len([f for f in dls.train.after_item.fs if f.name != 'noop'])!=0:
153 print("Applying item_tfms to the first sample:")
--> 154 s = [_apply_pipeline(dls.train.after_item, dsets.train[0])]
155 print(f"\nAdding the next {bs-1} samples")
156 s += [dls.train.after_item(dsets.train[i]) for i in range(1, bs)]
/usr/local/lib/python3.6/dist-packages/fastai2/data/block.py in _apply_pipeline(p, x)
122 except Exception as e:
123 print(f" applying {name} failed.")
--> 124 raise e
125 return x
126
/usr/local/lib/python3.6/dist-packages/fastai2/data/block.py in _apply_pipeline(p, x)
118 name = f.name
119 try:
--> 120 x = f(x)
121 if name != "noop": print(f" applying {name} gives\n {_short_repr(x)}")
122 except Exception as e:
/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in __call__(self, x, **kwargs)
70 @property
71 def name(self): return getattr(self, '_name', _get_name(self))
---> 72 def __call__(self, x, **kwargs): return self._call('encodes', x, **kwargs)
73 def decode (self, x, **kwargs): return self._call('decodes', x, **kwargs)
74 def __repr__(self): return f'{self.name}: {self.encodes} {self.decodes}'
/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in _call(self, fn, x, split_idx, **kwargs)
80 def _call(self, fn, x, split_idx=None, **kwargs):
81 if split_idx!=self.split_idx and self.split_idx is not None: return x
---> 82 return self._do_call(getattr(self, fn), x, **kwargs)
83
84 def _do_call(self, f, x, **kwargs):
/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in _do_call(self, f, x, **kwargs)
85 if not _is_tuple(x):
86 return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
---> 87 res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
88 return retain_type(res, x)
89
/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in <genexpr>(.0)
85 if not _is_tuple(x):
86 return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
---> 87 res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
88 return retain_type(res, x)
89
/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in _do_call(self, f, x, **kwargs)
84 def _do_call(self, f, x, **kwargs):
85 if not _is_tuple(x):
---> 86 return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
87 res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
88 return retain_type(res, x)
/usr/local/lib/python3.6/dist-packages/fastcore/dispatch.py in __call__(self, *args, **kwargs)
96 if not f: return args[0]
97 if self.inst is not None: f = MethodType(f, self.inst)
---> 98 return f(*args, **kwargs)
99
100 def __get__(self, inst, owner):
/usr/local/lib/python3.6/dist-packages/fastai2/vision/core.py in encodes(self, x)
242 def decodes(self, x:(PILBase,TensorImageBase)): return self._grab_sz(x)
243
--> 244 def encodes(self, x:TensorPoint): return _scale_pnts(x, self._get_sz(x), self.do_scale, self.y_first)
245 def decodes(self, x:TensorPoint): return _unscale_pnts(x.view(-1, 2), self._get_sz(x))
246
/usr/local/lib/python3.6/dist-packages/fastai2/vision/core.py in _scale_pnts(y, sz, do_scale, y_first)
215 def _scale_pnts(y, sz, do_scale=True, y_first=False):
216 if y_first: y = y.flip(1)
--> 217 res = y * 2/tensor(sz).float() - 1 if do_scale else y
218 return TensorPoint(res, img_size=sz)
219
/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in _f(self, *args, **kwargs)
270 def _f(self, *args, **kwargs):
271 cls = self.__class__
--> 272 res = getattr(super(TensorBase, self), fn)(*args, **kwargs)
273 return retain_type(res, self)
274 return _f
RuntimeError: The size of tensor a (4) must match the size of tensor b (2) at non-singleton dimension 1
Trying to figure this out based on those tensor sizes, I found these methods, which might be failing.