Can we use BBoxBlock independently without BBoxLblBlock?
I tried doing so, but it throws an error from bb_pad. I tried removing that method from the pipeline, but then it started complaining about invalid shapes despite all of them being the same:
```
Collating items in a batch
Error! It's not possible to collate your items in a batch
Could not collate the 0-th members of your tuples because got the following shapes
torch.Size([3, 224, 224]), torch.Size([3, 224, 224]), torch.Size([3, 224, 224]), torch.Size([3, 224, 224])
```
I can post the whole stack trace if you need it.
EDIT: Passing in a list of dummy labels of the required length gets through the pipeline; is that how it should be done?
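In case it helps anyone, here is a minimal sketch of that workaround, with a hypothetical `bbox_dict` mapping file names to box lists: keep BBoxLblBlock in the pipeline, but feed it one dummy label per box so bb_pad has something to pad.

```python
from fastai.vision.all import *

# Hypothetical getters -- adjust to your own dataset layout.
def get_bboxes(fn): return bbox_dict[fn.name]                   # list of [x1, y1, x2, y2]
def get_dummy_labels(fn): return ['object'] * len(get_bboxes(fn))

dblock = DataBlock(
    blocks=(ImageBlock, BBoxBlock, BBoxLblBlock),
    get_items=get_image_files,
    getters=[noop, get_bboxes, get_dummy_labels],
    item_tfms=Resize(224),
    n_inp=1,               # only the image is a model input
)
dls = dblock.dataloaders(path)
```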
I did try to adapt bb_pad by passing in an empty string as the label. I've also modified L1Loss to work with multiple outputs. Now I'm getting a "bool value of Tensor with more than one value is ambiguous" error. I've posted more about it under a similar topic here.
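For anyone hitting the same thing: that error usually means a multi-element tensor is being used where Python expects a single bool (often an `if` or `assert` inside a loss function). A toy reproduction:

```python
import torch

t = torch.tensor([1.0, -2.0])
# if t > 0: ...      # RuntimeError: bool value of Tensor with more than one value is ambiguous
if (t > 0).all():    # reduce to a single boolean first, e.g. with .all() or .any()
    print('all positive')
```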
Just a general question: would you guys (devs) or others in general be interested in a PR that includes a utility function for transferring weights between similar architectures? E.g. transferring a model's weights trained on ImageWoof to a model being used on the PETs dataset?
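Roughly, I'm imagining something like this (just a sketch, not a final API):

```python
import torch

def transfer_weights(src_model, dst_model):
    "Copy parameters whose names and shapes match; skip the rest (e.g. a head with a different number of classes)."
    src_sd, dst_sd = src_model.state_dict(), dst_model.state_dict()
    matched = {k: v for k, v in src_sd.items()
               if k in dst_sd and dst_sd[k].shape == v.shape}
    dst_sd.update(matched)
    dst_model.load_state_dict(dst_sd)
    return list(matched)   # report which parameters were transferred
```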
Ah, that actually makes sense, probably load_encoder? (I'll look in a little bit.) If so, would it be okay moving it a level higher (Learner), or would you rather keep its functionality there?
Hey guys, just wanted to point out this very cool behavior with get_preds and multiple models!
Let me build a scenario for you: I have two separate models that receive the same image, and I want to run inference with both. One model does, say, regression and the other does classification. Our first presumption would be that I need to make two new dataloaders, one per model, and then run any decoding separately, but that's unnecessary! Even though I have a dataloader built from the classification model, if I do decode_preds on the regression model, it will work!
I was very surprised by this (but it's a welcome surprise!). I presume it's because decodes looks at the type of the output and then does the decode_batch, which is very cool behavior.
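Concretely, something like this (with hypothetical `classif_learn`/`regress_learn`, both trained on images of the same size, and a hypothetical list of `test_items`):

```python
# One test DataLoader built from the classification model's DataLoaders...
test_dl = classif_learn.dls.test_dl(test_items)

# ...used for inference with both learners.
class_preds, _ = classif_learn.get_preds(dl=test_dl)
reg_preds,   _ = regress_learn.get_preds(dl=test_dl)
```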
I had a similar requirement in one of my projects: I wanted to export the weights of the encoder of a U-Net and use them for a classification problem. One additional requirement was that I had a few additional bottleneck layers after the encoder, which should also be exported in the same .pth file. I feel the way I did it is somewhat hairy, and it would be great if you could suggest a better way of doing the same.
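For the record, the hairy version boils down to this (assuming the encoder is the first child of the U-Net model, and `bottleneck` holds my extra layers; `clf_encoder`/`clf_bottleneck` are the matching modules on the classification side):

```python
import torch

# Save the encoder plus the extra bottleneck layers into a single .pth file.
torch.save({'encoder': learn.model[0].state_dict(),
            'bottleneck': bottleneck.state_dict()},
           'encoder_plus_bottleneck.pth')

# In the classification project:
state = torch.load('encoder_plus_bottleneck.pth')
clf_encoder.load_state_dict(state['encoder'])
clf_bottleneck.load_state_dict(state['bottleneck'])
```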
I'm using SaveModelCallback to save the best model while training. After training, I tried deleting and re-instantiating the learner with random weights and then ran learn.validate() (results went down, as expected).
Then I performed learn.load(exp_name) and ran learn.validate(), and the results were still as bad as with the random weights.
EDIT: Saving the model explicitly with learn.save and then loading it back reproduces the results.
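For comparison, the explicit round trip that does work (exp_name being the same fname I passed to SaveModelCallback):

```python
learn.save('exp_name')    # writes models/exp_name.pth
# ...delete and re-create the learner...
learn.load('exp_name')
learn.validate()          # reproduces the saved results
```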
I have a short question: does the Resize transform in item_tfms apply to just the images, or does it also resize the bounding boxes? Thanks for your advice.
The vision transforms use type dispatching to handle the various types (bounding boxes, keypoints, segmentation masks, etc.), so yes. However, IIRC you should use a padding resize method so you don't lose any data.
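Something like this, if I remember the API correctly:

```python
from fastai.vision.all import *

# Crop or squish can push boxes off the image; padding preserves them.
item_tfms = Resize(224, method=ResizeMethod.Pad, pad_mode=PadMode.Zeros)
```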
So at least xresnet50 should work with pretrained weights, or might that fail as well? In my previous experiments I tried training xresnet34 from scratch: accuracy went from 3% to 26%, and the loss went to nan after 20 epochs. (I was using the ranger optimizer with lr=1e-4.)
In general, is it safe to start with xresnet models when working on a new dataset/augmentation/loss function?
What are the recommended optimizers and lr schedulers when training for a large number of epochs (50+)?
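For reference, my setup was roughly the following (I understand fit_flat_cos is the schedule usually paired with ranger, rather than one-cycle):

```python
from fastai.vision.all import *

# xresnet34 from scratch with the ranger optimizer, assuming `dls` is my DataLoaders.
learn = Learner(dls, xresnet34(n_out=dls.c), opt_func=ranger, metrics=accuracy)
learn.fit_flat_cos(20, lr=1e-4)
```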
Looking at the log it seems it tried to download xresnet50 instead…
You can always pass pretrained=False in your learner, and it may work (though training will take much longer and it won't benefit from the pretrained network).
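I.e., a sketch, assuming `dls` is your DataLoaders:

```python
from fastai.vision.all import *

# Train the xresnet from scratch instead of downloading pretrained weights.
learn = cnn_learner(dls, xresnet50, pretrained=False, metrics=accuracy)
```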