Fastai v2 vision

Point 1 is not supposed to do anything. That syntax only works when you want to add a new encodes or decodes to a transform; it's not valid Python syntax otherwise.
If you want a new show method, you should subclass TensorBBox and change the show method.
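
For example, a minimal sketch (the class name and the hard-coded display options here are just illustrative):

```python
from fastai2.vision.all import *
from fastai2.vision.core import _draw_rect  # private helper, so import it explicitly

class MyTensorBBox(TensorBBox):
    "TensorBBox with a customized show method"
    def show(self, ctx=None, **kwargs):
        x = self.view(-1, 4)
        for b in x:
            # pass whatever display options you want straight through to _draw_rect
            _draw_rect(ctx, b, hw=False, color='red', text_size=12, **kwargs)
        return ctx
```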

So you’re suggesting that subclassing TensorBBox is the only solution? I see the _draw_rect method has the parameters I want to customize, so can I do it using some kind of transform that makes sure those parameters get passed to _draw_rect?

Also, won’t it mess up all the methods dispatched for TensorBBox since the type will be changed?

Continuing to the 3rd point, I’m using the same SetAttributes defined here

Passed it to BBoxBlock:

BBoxBlock = TransformBlock(type_tfms=[TensorBBox.create], item_tfms=[PointScaler, SetAttributes(tsize=12, color='red')], dls_kwargs={'before_batch': bb_pad})

And I’m using it in the redefined TensorBBox:

class TensorBBox(TensorPoint):
    "Basic type for a tensor of bounding boxes in an image"
    @classmethod
    def create(cls, x, img_size=None)->None: return cls(tensor(x).view(-1, 4).float(), img_size=img_size)

    def show(self, ctx=None, **kwargs):
        x = self.view(-1,4)
        tsize,color = self.get_meta('tsize'),self.get_meta('color')
        for b in x: _draw_rect(ctx, b, hw=False,color=color,text_size=tsize, **kwargs)
        return ctx

But it failed. I tried to debug it using dblock.summary(). Here are the error logs:

Building one batch
Applying item_tfms to the first sample:
  Pipeline: BBoxLabeler -> PointScaler -> Resize -> SetAttributes -> ToTensor
    starting from
      (PILImage mode=RGB size=500x375, TensorBBox of size 1x4, TensorMultiCategory([12]))
    applying BBoxLabeler gives
      (PILImage mode=RGB size=500x375, TensorBBox of size 1x4, TensorMultiCategory([12]))
    applying PointScaler failed.

---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

<ipython-input-53-29e18c57cddf> in <module>()
----> 1 pascal.summary(path/'train')

11 frames

/usr/local/lib/python3.6/dist-packages/fastai2/data/block.py in summary(self, source, bs, **kwargs)
    152     if len([f for f in dls.train.after_item.fs if f.name != 'noop'])!=0:
    153         print("Applying item_tfms to the first sample:")
--> 154         s = [_apply_pipeline(dls.train.after_item, dsets.train[0])]
    155         print(f"\nAdding the next {bs-1} samples")
    156         s += [dls.train.after_item(dsets.train[i]) for i in range(1, bs)]

/usr/local/lib/python3.6/dist-packages/fastai2/data/block.py in _apply_pipeline(p, x)
    122         except Exception as e:
    123             print(f"    applying {name} failed.")
--> 124             raise e
    125     return x
    126 

/usr/local/lib/python3.6/dist-packages/fastai2/data/block.py in _apply_pipeline(p, x)
    118         name = f.name
    119         try:
--> 120             x = f(x)
    121             if name != "noop": print(f"    applying {name} gives\n      {_short_repr(x)}")
    122         except Exception as e:

/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in __call__(self, x, **kwargs)
     70     @property
     71     def name(self): return getattr(self, '_name', _get_name(self))
---> 72     def __call__(self, x, **kwargs): return self._call('encodes', x, **kwargs)
     73     def decode  (self, x, **kwargs): return self._call('decodes', x, **kwargs)
     74     def __repr__(self): return f'{self.name}: {self.encodes} {self.decodes}'

/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in _call(self, fn, x, split_idx, **kwargs)
     80     def _call(self, fn, x, split_idx=None, **kwargs):
     81         if split_idx!=self.split_idx and self.split_idx is not None: return x
---> 82         return self._do_call(getattr(self, fn), x, **kwargs)
     83 
     84     def _do_call(self, f, x, **kwargs):

/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in _do_call(self, f, x, **kwargs)
     85         if not _is_tuple(x):
     86             return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
---> 87         res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
     88         return retain_type(res, x)
     89 

/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in <genexpr>(.0)
     85         if not _is_tuple(x):
     86             return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
---> 87         res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
     88         return retain_type(res, x)
     89 

/usr/local/lib/python3.6/dist-packages/fastcore/transform.py in _do_call(self, f, x, **kwargs)
     84     def _do_call(self, f, x, **kwargs):
     85         if not _is_tuple(x):
---> 86             return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
     87         res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
     88         return retain_type(res, x)

/usr/local/lib/python3.6/dist-packages/fastcore/dispatch.py in __call__(self, *args, **kwargs)
     96         if not f: return args[0]
     97         if self.inst is not None: f = MethodType(f, self.inst)
---> 98         return f(*args, **kwargs)
     99 
    100     def __get__(self, inst, owner):

/usr/local/lib/python3.6/dist-packages/fastai2/vision/core.py in encodes(self, x)
    242     def decodes(self, x:(PILBase,TensorImageBase)): return self._grab_sz(x)
    243 
--> 244     def encodes(self, x:TensorPoint): return _scale_pnts(x, self._get_sz(x), self.do_scale, self.y_first)
    245     def decodes(self, x:TensorPoint): return _unscale_pnts(x.view(-1, 2), self._get_sz(x))
    246 

/usr/local/lib/python3.6/dist-packages/fastai2/vision/core.py in _scale_pnts(y, sz, do_scale, y_first)
    215 def _scale_pnts(y, sz, do_scale=True, y_first=False):
    216     if y_first: y = y.flip(1)
--> 217     res = y * 2/tensor(sz).float() - 1 if do_scale else y
    218     return TensorPoint(res, img_size=sz)
    219 

/usr/local/lib/python3.6/dist-packages/fastai2/torch_core.py in _f(self, *args, **kwargs)
    270         def _f(self, *args, **kwargs):
    271             cls = self.__class__
--> 272             res = getattr(super(TensorBase, self), fn)(*args, **kwargs)
    273             return retain_type(res, self)
    274         return _f

RuntimeError: The size of tensor a (4) must match the size of tensor b (2) at non-singleton dimension 1

Looking at those tensor sizes, I tried to work out which methods might be failing.

Can we use BBoxBlock independently without BBoxLblBlock?

I tried doing so, but it throws an error from bb_pad. I tried removing that method from the pipeline, but then it started complaining about invalid shapes despite them all being the same.

Collating items in a batch
Error! It's not possible to collate your items in a batch
Could not collate the 0-th members of your tuples because got the following shapes
torch.Size([3, 224, 224]),torch.Size([3, 224, 224]),torch.Size([3, 224, 224]),torch.Size([3, 224, 224])

I can post whole stacktrace if you need.

EDIT: Passing in a list of dummy labels of the required length gets through the pipeline. Is that how it should be done?

If you use it on its own, you have to write your own padding function. bb_pad is meant to work with labeled bounding boxes, so you need to adapt it.
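
Something along these lines should be the general shape of it (an untested sketch; bb_pad_no_lbl is a made-up name that just mirrors the labelled bb_pad without the label handling):

```python
from fastai2.vision.all import *

def bb_pad_no_lbl(samples):
    "Pad each (image, bboxes) sample with zero boxes so every sample has the same number of boxes."
    max_len = max(len(s[1]) for s in samples)
    def _f(img, bbox):
        bbox = torch.cat([bbox, bbox.new_zeros(max_len - bbox.shape[0], 4)])
        return img, bbox
    return [_f(*s) for s in samples]

# then plug it into your unlabelled BBoxBlock in place of bb_pad:
BBoxBlock = TransformBlock(type_tfms=TensorBBox.create, item_tfms=PointScaler,
                           dls_kwargs={'before_batch': bb_pad_no_lbl})
```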

I did try to adapt bb_pad by passing in an empty string as the label. I’ve also modified L1Loss to work with multiple outputs. Now I’m getting a "bool value of Tensor with more than one value is ambiguous" error. I’ve posted more about it under a similar topic here

Just a general question: would you guys (devs), or others in general, be interested in a PR that includes a utility function for transferring weights between similar architectures? I.e. transferring a model’s weights trained on ImageWoof to a model being used on the PETs dataset?

See this post for a general idea: Loading pretrained weights that are not from ImageNet

There is a function that does the same somewhere in text.models.

Ah, that actually makes sense, probably load_encoder? (I’ll look in a little bit.) If so, would it be okay moving it a level higher (to Learner), or would you rather keep its functionality there?

We can have the utility in torch_core and some access in Learner, yes.
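
As a rough sketch of what such a utility could look like (transfer_weights is a hypothetical name, not an existing fastai function): copy every parameter whose name and shape match, and skip the rest (typically the head).

```python
import torch

def transfer_weights(model, src_state_dict):
    "Copy every parameter from src_state_dict whose name and shape match into model; return what was copied."
    dst = model.state_dict()
    matched = {k: v for k, v in src_state_dict.items()
               if k in dst and dst[k].shape == v.shape}
    dst.update(matched)
    model.load_state_dict(dst)
    return list(matched)

# e.g. moving weights trained on ImageWoof onto a PETs model
# (assumes 'imagewoof.pth' holds a plain state_dict):
copied = transfer_weights(pets_learn.model, torch.load('imagewoof.pth'))
```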

Hey guys, just wanted to point out this very cool behavior with get_preds and multiple models!

Let me build a scenario for you: I have two separate models that receive the same image, and I want to do inference with both. One model does, say, regression and the other does classification. The first presumption would be that I need to make a new dataloader for each and then run any decoding separately, but that’s unnecessary! Even though I have a dataloader built from the classification model, getting decoded preds from the regression model still works!

Example code snippet that won’t break:

imgs = ['img1.jpg', ...]
dl = classify_learner.dls.test_dl(imgs)
a,b,c = classify_learner.get_preds(dl=dl, with_decoded=True)
a,b,c = reg_learner.get_preds(dl=dl, with_decoded=True)

Very surprised by this (but it’s a welcome surprise!). I presume it has to do with decodes looking at the type of the output and then doing the decode_batch, which is very cool behavior :slight_smile:

The decoding is only the one from the loss function, that is why it works :wink:
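
i.e. with_decoded just applies the loss function’s decodes to the raw predictions, regardless of which learner the dataloader was built from. Roughly (a paraphrase, not the exact library code):

```python
preds, targs = reg_learner.get_preds(dl=dl)
# with_decoded=True is roughly equivalent to applying the loss function's decodes yourself:
decoded = getattr(reg_learner.loss_func, 'decodes', noop)(preds)
```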

I had a similar requirement in one of my projects: I wanted to export the weights of the encoder of a Unet and use them for a classification problem. One additional requirement was that I had a few additional bottleneck layers after the encoder, which should also be exported in the same .pth file. I feel the way I did it is somewhat hairy, and it would be great if you could suggest a better way of doing the same :slightly_smiling_face:

1. Save weights:
encoder = nn.Sequential(*learn.model.layers[:4])  # keep the encoder plus the additional bottleneck layers
checkpoint = {
    "arch": encoder,
    "model": encoder.state_dict()
}
torch.save(checkpoint, 'encoder.pth')

(If your requirement is to export only the encoder, you could simply do learn.model[0].)

2. Load weights for a different task:
ckpt = torch.load('encoder.pth')
encoder = ckpt['arch']
encoder.load_state_dict(ckpt['model'])
# output:
# <All keys matched successfully>

One caveat though: it needs any custom layer/block definitions to be imported (e.g. ResBlock).

It would be great if this functionality became part of the library :smiley:

@sgugger @jeremy Hello! Have you heard by chance of this project: https://github.com/microsoft/unilm/tree/master/layoutlm ?

I’m not able to reproduce results using learn.load.

saved_model = SaveModelCallback(fname=exp_name)

I’m using SaveModelCallback to save the best model while training. After training, I tried deleting and re-instantiating the learner with random weights and then ran learn.validate() (the results went down, as expected).

Then I performed learn.load(exp_name) and ran learn.validate(), but the results are still as bad as with the random weights.

EDIT: Saving it explicitly using learn.save and then loading it back reproduces the results
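
For reference, the explicit workaround is just (using the same exp_name as above):

```python
learn.save(exp_name)       # writes models/{exp_name}.pth under learn.path
# ... delete / re-create the learner ...
learn = learn.load(exp_name)
learn.validate()           # reproduces the metrics from training
```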

The deep and deeper versions of xresnet and xse_resnext produce the following error when used with pool=MaxPool.

RuntimeError                              Traceback (most recent call last)

<ipython-input-44-0aadd6b2d7d1> in <module>()
----> 1 body(torch.randn(8,3,288,288)).shape

5 frames

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in forward(self, input)
     98     def forward(self, input):
     99         for module in self:
--> 100             input = module(input)
    101         return input
    102 

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

/usr/local/lib/python3.6/dist-packages/fastai2/layers.py in forward(self, x)
    552         self.act = defaults.activation(inplace=True) if act_cls is defaults.activation else act_cls()
    553 
--> 554     def forward(self, x): return self.act(self.convpath(x) + self.idpath(x))
    555 
    556 # Cell

RuntimeError: The size of tensor a (5) must match the size of tensor b (4) at non-singleton dimension 3

Code to reproduce the error:

arch = partial(xresnet34_deep, pool=MaxPool)
body = create_body(arch,cut=-4)
body(torch.randn(8,3,288,288))

I have a short question: does the Resize transform in item_tfms apply only to the images, or does it also resize the bounding boxes? Thanks for your advice.

The vision transforms use type dispatch to handle the various types (bounding boxes, keypoints, segmentation, etc.), so yes :slight_smile: However, IIRC you should use a padding method so you don’t lose any data.

Edit: yes, you should use ResizeMethod.Squish.
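
For example, something like this in the block definition (the getter names here are just placeholders for however you fetch your boxes and labels):

```python
dblock = DataBlock(
    blocks=(ImageBlock, BBoxBlock, BBoxLblBlock),
    get_items=get_image_files,
    get_y=[get_bboxes, get_labels],   # hypothetical getters for the boxes and their labels
    item_tfms=Resize(224, method=ResizeMethod.Squish),  # squish rather than crop, so no boxes get cut off
    n_inp=1)
```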

Abysmal performance from xresnet34 compared to vanilla resnet34 on the Stanford Dogs dataset.

xresnet34: [training results screenshot]

resnet34: [training results screenshot]

Any reason why this is happening?

Colab link: https://colab.research.google.com/drive/1zRk0nRMdtLZdiuKpEzJAymsZUgWZg9-b

I think it’s due to the fact that there’s no pre-trained version. See this comment.

So at least xresnet50 should work with pretrained weights, or might that fail as well? In my previous experiments I tried training xresnet34 from scratch: it went from 3% to 26%, and the loss went to 'nan' after 20 epochs. (I was using the ranger optimizer with lr=1e-4.)

In general, is it safe to start with xresnet models when working on a new dataset/augmentation/loss function?

What are the recommended optimizers and lr_schedulers while training for large no. of epochs (50+)