How to use xse_resnext50 and similar models?

deepgander · December 8, 2020, 1:00am

I don’t think I understand how to use xse_resnext50 and similar models. In the code below, resnet18(pretrained=True) worked just fine, but I was unsure whether to pass xse_resnext50(pretrained=True, n_out=1024) or xse_resnext50(pretrained=True, n_out=5). (I have 5 target classes, but I’m really just confused as to what the n_out parameter does.) This version failed:
learn = Learner(dls, xse_resnext50(pretrained=True, n_out=1024), loss_func=LabelSmoothingCrossEntropy())
learn.fit_flat_cos(15, lr=0.001)
Once I changed it to n_out=5 (I have five classes in a multi-class problem), it seemed to work.

I suspect that I am just missing something very obvious here. Should n_out be the number of outputs my model has? But doesn’t my datablock encode that already? I certainly don’t need to specify such a parameter if I use, e.g., resnet18.

ilovescience · December 8, 2020, 4:33am

Can you please clarify what the problem is? Indeed, the number of classes should be encoded in the datablock, and we would expect that n_out=1024 wouldn’t work for a dataset with 5 classes.

By the way, most of the XResNet models (including this one) do not actually have pretrained weights yet.

deepgander · December 8, 2020, 8:08am

Thanks, the issue was indeed n_out=classes. I was really just puzzled why this particular class of models has this parameter, but your response and another thread that someone just pointed out to me (Xresnet Transfer Learning) confirm that you should indeed set n_out to the number of classes (the other thread suggested n_out=dls.c).

Thanks for highlighting that these models don’t have pretrained weights. I have now tried to confirm which networks actually have pretrained weights, but I could not figure out whether there’s an overview location (“model zoo”) for that. I followed the source code and found where the pretrained models get loaded from some Amazon cloud storage, but presumably I’m overlooking something obvious here, or am I not?