Xresnet Transfer Learning

Why is transfer learning with xresnet not working? It was said that xresnet is a far better option than a normal resnet, but when I use it for transfer learning with cnn_learner it takes a huge number of epochs to converge, while normal resnets take just an epoch or two. I tried transfer learning on the Pets dataset using xresnet, which gave me only 45% accuracy and 57% error rate even after 25 epochs, while a normal resnet gave me 93% accuracy and 8% error rate in just three epochs. Can somebody explain why transfer learning with xresnet is not a good option?

Can you show your code?

I am using xresnet34, and for all simple cases similar to ImageNet it tends to converge almost immediately.

I’m sorry 😅, just this afternoon I replaced it with resnet34 and trained it all over again.
I do not have the code for the xresnet run, but it was similar to all the pretraining we do with fastai.
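It was roughly the standard pipeline, something like this (a sketch from memory, not my exact code; the filename pattern and transforms are the usual fastbook Pets setup):

```python
from fastai.vision.all import *

# Standard fastai transfer learning on Pets, with xresnet34 swapped in
# for the usual resnet34.
path = untar_data(URLs.PETS)
dls = ImageDataLoaders.from_name_re(
    path, get_image_files(path/'images'), pat=r'(.+)_\d+.jpg$',
    item_tfms=Resize(224), batch_tfms=aug_transforms())
learn = cnn_learner(dls, xresnet34, metrics=[accuracy, error_rate])
learn.fine_tune(4)
```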

I just replaced the xresnet with a resnet, that’s it. It trained very well and gave me 98% on CIFAR-10.

The xresnets only have one pretrained model, the 50. (And there are potentially a few issues with its weights.)

Oh, I see. But it seems to download a .pth file when we call it… how is that happening?

It always will. What may or may not happen is whether the weights get loaded in. See this comment. It’ll download first and then try to use the weights (which will only work on a 50).
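If I understand it correctly, the behaviour is roughly this (a sketch assuming the fastai v2 constructors):

```python
from fastai.vision.all import xresnet34, xresnet50

# pretrained=True triggers the .pth download for every variant, but the
# published state dict only matches the 50-layer architecture.
m50 = xresnet50(pretrained=True)  # downloads, and the weights load
m34 = xresnet34(pretrained=True)  # downloads too, but the weights don't fit
```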

Oh, I get it. Thank you 🙂

Not exactly the problem here, but I noticed that in chapter 7 an xresnet50 model is trained from scratch on Imagenette; however, the head of the model was not replaced with a new one (i.e. one for Imagenette).
I know that cnn_learner does that automatically, but what about Learner? I’ve taken a look at the source code, but I can’t find where that is done.

It’s not done there. No pooling head is made; it’s just a final linear layer.

So does Learner change the final linear layer to have 10 outputs so it can train on Imagenette?

Take a look at learn.model to see.
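For example, since the xresnet is built as a plain nn.Sequential, the last child is the classifier:

```python
# The final layer; for a stock xresnet this is the 1,000-way
# ImageNet linear layer.
learn.model[-1]
```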

I looked at it, and it still has 1,000 output units, but I’m able to train the model.

Correct. The reason is that we have 1,000 logits. If you had 1,001 classes it would throw an error; since you have fewer, it will still train. It will train oddly, mind you (lr_find will look strange, and you may notice a higher loss). To fix this, pass n_out=dls.c in your call to xresnet, as in the sketch below.
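A minimal sketch, assuming dls is the Imagenette DataLoaders from chapter 7 (so dls.c is 10):

```python
from fastai.vision.all import *

# Size the head to the dataset instead of the default 1,000 ImageNet
# classes, then hand the model to a plain Learner as in chapter 7.
model = xresnet50(n_out=dls.c)
learn = Learner(dls, model, loss_func=CrossEntropyLossFlat(), metrics=accuracy)
```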

Great, that worked, thanks. I think that should be done explicitly in the chapter 7 notebook; otherwise it may lead others to think that Learner is creating the appropriate head.

Please open an issue in the fastbook repo or make a PR 🙂

Sure, thanks.

@SamJoel Are you able to share the script you used to get 93% accuracy on the Pets dataset? I was playing with this dataset the other day and couldn’t get particularly good results, so I wonder whether I was making a similar mistake to you. Thanks!

@bkj I don’t have a GitHub repo for this 😅. I was just checking things out and playing with fastai. But I used a normal resnet (not an xresnet) and the normal fastai lr_find, and trained it for a long time: 25 or more epochs, running lr_find again after every 8 epochs and alternately freezing and unfreezing. I got 93% accuracy in three or four epochs, and it went up to 94.3% after a long time.
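The loop looked roughly like this (a reconstruction, assuming a fastai version where lr_find returns a suggested valley):

```python
from fastai.vision.all import *

# Alternate frozen and unfrozen phases, re-running lr_find before each.
learn = cnn_learner(dls, resnet34, metrics=[accuracy, error_rate])
for _ in range(3):
    learn.freeze()
    lr = learn.lr_find().valley
    learn.fit_one_cycle(4, lr)
    learn.unfreeze()
    lr = learn.lr_find().valley
    learn.fit_one_cycle(4, slice(lr / 10, lr))
```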

I can still see that pretrained weights are not available. I would like to train the models and share the weights (at least for the small ones).

Is there a standard script for training? If not, then I need to know the following details:

  • Number of epochs
  • Learning rate
  • Weight decay, etc.

Let me know if you’re interested. In case it helps, a rough sketch of what I have in mind is below.
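(Imagenette stands in for ImageNet here, and the epochs, learning rate, and weight decay are placeholders, not a known-good recipe.)

```python
from fastai.vision.all import *

# Placeholder pretraining setup for a small xresnet; the hyperparameters
# below are guesses that would need tuning for real ImageNet training.
path = untar_data(URLs.IMAGENETTE_320)
dls = ImageDataLoaders.from_folder(
    path, valid='val', item_tfms=RandomResizedCrop(224),
    batch_tfms=aug_transforms())
model = xresnet34(n_out=dls.c)
learn = Learner(dls, model, loss_func=LabelSmoothingCrossEntropy(),
                opt_func=ranger, metrics=accuracy)
learn.fit_flat_cos(20, lr=1e-3, wd=1e-2)
```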