[Solved] Error while creating learner for senet : TypeError: conv2d(): argument 'input' (position 1) must be Tensor, not bool

I am trying to create a CNN learner using create_cnn with arch set to se_resnet50, picking the weights from https://github.com/Cadene/pretrained-models.pytorch

The error (shown in the screenshots, not reproduced here) is:

TypeError: conv2d(): argument 'input' (position 1) must be Tensor, not bool

Hey,

the function create_cnn expects a callable that builds the model. One solution would be:

model_name = 'se_resnet50'
def get_cadene_model(pretrained=True, model_name='se_resnet50'):
    if pretrained:
        arch = pretrainedmodels.__dict__[model_name](num_classes=1000, pretrained='imagenet')
    else:
        arch = pretrainedmodels.__dict__[model_name](num_classes=1000, pretrained=None)
    return arch

learn = create_cnn(data, get_cadene_model, metrics=[error_rate], wd=0.01, cut=-2)

With kind regards,
Christian


Thank you very much for sharing this.

I tried something similar without changing the head, since I think create_cnn will do that for us.

However I got:

RuntimeError: Given input size: (2048x2x2). Calculated output size: (2048x-4x-4). Output size is too small at /opt/conda/conda-bld/pytorch_1544202130060/work/aten/src/THNN/generic/SpatialAveragePooling.c:48

Any idea?
I tried changing the head as well, as shown by @jainds, but I still get the same error.
@jainds Did it work for you?

Please refer to my code here. It is working now.
https://forums.fast.ai/t/lesson-5-advanced-discussion/30865/40?u=hwasiti

I think passing cut=-2 to create_cnn should get your code running. I do not have access to the link you shared.
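
If it helps, here is a quick, untested way to check what cut=-2 removes. It assumes the Cadene pretrainedmodels package and that the last two top-level children of the SENet are its original pooling and classifier:

import pretrainedmodels

m = pretrainedmodels.__dict__['se_resnet50'](num_classes=1000, pretrained=None)
for i, (name, child) in enumerate(m.named_children()):
    print(i, name, type(child).__name__)
# If the last two children printed are the original average pooling and linear
# classifier, cut=-2 drops exactly those, so create_cnn can attach its own head
# to the convolutional body.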

The link does not work for you because version 3 of the fastai course and its related lesson forum threads are not public yet; they will be soon.

If you need my working code in the link above I can DM it to you.

Thanks! I got it working. Here is how I did it:

from functools import partial  # in case it is not already in scope from fastai's star imports

model_name = 'se_resnet50'

def get_cadene_model(pretrained=True, model_name='se_resnet50'):
    if pretrained:
        arch = pretrainedmodels.__dict__[model_name](num_classes=1000, pretrained='imagenet')
    else:
        arch = pretrainedmodels.__dict__[model_name](num_classes=1000, pretrained=None)
    return arch

learn = create_cnn(data, partial(get_cadene_model, model_name='se_resnext50_32x4d'), cut=-2)


Good that it worked for you like that. In my troubleshooting I hardcoded the model_name in the get_cadene_model function, and it still did not work, so I am surprised it worked for you that way.

What is your fastai version? Mine is v1.0.39.

So here is my code that worked eventually:

%reload_ext autoreload
%autoreload 2
%matplotlib inline

from fastai import *
from fastai.vision import *
import pretrainedmodels

path = untar_data(URLs.PETS); path
path_anno = path/'annotations'
path_img = path/'images'
fnames = get_image_files(path_img)
np.random.seed(2)
pat = re.compile(r'/([^/]+)_\d+.jpg$')

bs = 64

data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(),
                                   size=299, bs=bs//2).normalize(imagenet_stats)

def get_model(pretrained=True, model_name = 'resnet50', **kwargs ): 
    if pretrained:
        arch = pretrainedmodels.__dict__[model_name](num_classes=1000, pretrained='imagenet')
    else:
        arch = pretrainedmodels.__dict__[model_name](num_classes=1000, pretrained=None)
    return arch

custom_head = create_head(nf=2048*2, nc=37, ps=0.5, bn_final=False) 

# Below you can change the imported model into any of the models available in the `pretrainedmodels` 
# which can be shown by: pretrainedmodels.model_names
fastai_resnet50=nn.Sequential(*list(children(get_model(model_name = 'resnet50'))[:-2]),custom_head) 

def get_fastai_model(pretrained=True, **kwargs ): 
    return fastai_resnet50

learn = create_cnn(data, get_fastai_model, metrics=error_rate)
learn.fit_one_cycle(5) 
learn.unfreeze()
learn.fit_one_cycle(1, max_lr=slice(1e-6,1e-4)) 

Edit:
Maybe it is safer to create the custom_head as follows:

custom_head = create_head(nf=num_features_model(body)*2, nc=37, ps=0.5, bn_final=False)

I haven't tested it yet, but it covers the case where the output features of the body (the model cut before its original pooling and classifier) are not 2048.
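
Here is a rough, untested sketch of that idea, assuming fastai v1's num_features_model helper and the get_model function defined above; it derives nf from the cut body instead of hard-coding 2048:

# Sketch only: body is the Cadene model cut before its original pooling/classifier.
body = nn.Sequential(*list(children(get_model(model_name='resnet50')))[:-2])
nf = num_features_model(body) * 2   # *2 because fastai's default head concatenates average and max pooling
custom_head = create_head(nf=nf, nc=37, ps=0.5, bn_final=False)
model = nn.Sequential(body, custom_head)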

Update:
Jeremy tweeted an even better implementation for importing architectures from Cadene. What I missed in my code above was the split points for discriminative learning rates.

I updated the library to v1.0.40
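
For anyone wondering what those split points do, here is a hedged sketch (the split points below are my guess for an SE-ResNet body, not Jeremy's): the model is divided into layer groups so that unfreezing plus max_lr=slice(...) trains each group with its own learning rate.

learn = create_cnn(data, get_cadene_model, cut=-2, metrics=error_rate)
# Hypothetical split: early body layers / late body layers / the new head.
learn.split(lambda m: (m[0][3], m[1]))
learn.unfreeze()
learn.fit_one_cycle(1, max_lr=slice(1e-6, 1e-4))   # lower lr for the body, higher for the head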

Hi, are you able to create learners from pretrainedmodels such as alexnet or vgg16? It seems I'm only able to use a subset of the models successfully.

Edit: I got the Cadene alexnet working, but it seems a little clunky. Is there a cleaner way to do this?

data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=64
                                  ).normalize(imagenet_stats)

arch = pretrainedmodels.__dict__['resnet18']()

net = nn.Sequential(OrderedDict([
    ('features', arch._features),
    ('classifier', nn.Sequential(Flatten(), *children(arch)[1:]))
]))

learn = create_cnn(data, lambda *args: net, metrics=error_rate, custom_head=net.classifier)

print(learn.summary())

Hi

I haven’t tried those models…

It is better to use that GitHub code instead of mine. I have edited my post above with the following:
Update:
Jeremy tweeted an even better implementation for importing architectures from Cadene. What I missed in my code above was the split points for discriminative learning rates.


@hwasiti Thanks for that link, it really helped! One last thing I can't figure out: learn.summary() for my model seems to be incorrect even though learn.model looks right. The summary shows the last linear layer with 1000 classes even though I changed it. If you have come across this, please let me know why. Using helper functions from that notebook:

pretrained='imagenet'

def alexnet_cadene(*args):
    model = pretrainedmodels.__dict__['alexnet'](pretrained=pretrained)
    sz = pretrainedmodels.pretrained_settings['alexnet']['imagenet']['input_size'][-1]
    data.sz = data.one_batch()[0].size()[-1]
    if data.sz != sz:
        raise ValueError(f'data size should be {sz} but is instead {data.sz}')    
    model.last_linear.out_features = data.c
    all_layers = list(model.children())
    model = nn.Sequential(all_layers[0], nn.Sequential(Flatten(), *all_layers[1:]))    
    return model

arch_summary(lambda _: alexnet_cadene()) # overall
arch_summary(lambda _: next(alexnet_cadene().children())) # body
arch_summary(lambda _: list(alexnet_cadene().children())[1]) # head

learn = create_cnn(data, alexnet_cadene, custom_head=children(alexnet_cadene())[1], metrics=error_rate,
                   split_on=lambda m: (m[0][0][6], m[1], m[1][7]))

get_groups(nn.Sequential(*learn.model[0][0], *learn.model[1]), learn.layer_groups)

print(learn.layer_groups)

print(learn.model) 
print(learn.summary()) # why is the model summary wrong?
# last linear layer says 1000 classes when my dataset has 37

I haven't tried that GitHub repo, but I noticed in your code that you haven't changed the output of the head. You are taking the head of the pretrained model as it is. The pretrained models are trained on ImageNet, which has 1,000 classes. You should specify your model's head yourself.

See my code, this line in particular:
custom_head = create_head(nf=2048*2, nc=37, ps=0.5, bn_final=False)

I think you should do something similar…


Hi, thanks for your reply. I'm changing the number of output classes of the last linear layer in the line model.last_linear.out_features = data.c, and printing learn.model shows that the last layer has an output size of 37. So I've just taken the standard AlexNet head, changed the number of outputs of the last layer, and passed that as a custom head. I made a new post which shows the printout information.

Edit: It turns out the issue was related to what you pointed out. I thought I was editing the output size of the last linear layer, but the way I did it was not a proper way of doing so, so the number of outputs when running learn.predict was actually still 1000 even though learn.model said the output size was 37. So whereas I thought learn.summary() was wrong, it was actually learn.model that was wrong.
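
For anyone else hitting this, my understanding of the PyTorch behaviour behind it (a minimal standalone sketch, AlexNet-like sizes assumed): assigning to out_features only changes the stored attribute; it does not resize the weight matrix, so the layer still produces 1000 outputs. Replacing the layer does resize it, with freshly initialised weights:

from torch import nn

lin = nn.Linear(4096, 1000)
lin.out_features = 37
print(lin.weight.shape)                # still torch.Size([1000, 4096]) -> still 1000 outputs

lin = nn.Linear(lin.in_features, 37)   # replace the layer instead of editing the attribute
print(lin.weight.shape)                # torch.Size([37, 4096])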
