Error with custom CNN, Exception: No weight layer

Hi all,

I’m still very much a beginner and working through the first part of the fastai course. Apologies in advance if this is very basic or is covered in a later class.

I wanted to create the simplest CNN I could for image recognition and then incrementally add layers to see what happened to the accuracy.

I copied the alexnet.py code from torchvision/models and edited it to be a two-layer CNN. This is the main body of the file:

import torch
import torch.nn as nn

class MoonNet2L(nn.Module):
    def __init__(self, num_classes=2):
        super(MoonNet2L, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(32, 64, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        # self.avgpool = nn.AdaptiveAvgPool2d((56, 56))
        self.classifier = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(56 * 56 * 64, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x
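
As a quick sanity check of the shapes (just a sketch, assuming 224x224 RGB inputs, which is what the 56 * 56 * 64 input size of the Linear layer implies):

# Shape check with a dummy batch (assumes 224x224 RGB input)
model = MoonNet2L(num_classes=2)
dummy = torch.randn(1, 3, 224, 224)   # one fake image
feats = model.features(dummy)         # -> torch.Size([1, 64, 56, 56])
print(feats.shape)
print(model(dummy).shape)             # -> torch.Size([1, 2])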

However, when I try to run:

learn = cnn_learner(data, models.moonnet2l, metrics=accuracy)

I get the following error:

~/anaconda3/envs/fc_fastai/lib/python3.7/site-packages/fastai/torch_core.py in in_channels(m)
    261     for l in flatten_model(m):
    262         if hasattr(l, 'weight'): return l.weight.shape[1]
--> 263     raise Exception('No weight layer')
    264
    265 class ModelOnCPU():

Exception: No weight layer

The only way I can get this error to go away is by adding the AdaptiveAvgPool2d layer (i.e. if I uncomment that line in __init__, everything works). Why is that?

Furthermore, if I do that and run a quick learn.summary() afterwards, I get:

Sequential

Layer (type)         Output Shape      Param #    Trainable
Conv2d               [32, 224, 224]    2,432      False
BatchNorm2d          [32, 224, 224]    64         True
ReLU                 [32, 224, 224]    0          False
MaxPool2d            [32, 112, 112]    0          False
Conv2d               [64, 112, 112]    51,264     False
BatchNorm2d          [64, 112, 112]    128        True
ReLU                 [64, 112, 112]    0          False
MaxPool2d            [64, 56, 56]      0          False
AdaptiveAvgPool2d    [64, 1, 1]        0          False
AdaptiveMaxPool2d    [64, 1, 1]        0          False
Flatten              [128]             0          False
BatchNorm1d          [128]             256        True
Dropout              [128]             0          False
Linear               [512]             66,048     True
ReLU                 [512]             0          False
BatchNorm1d          [512]             1,024      True
Dropout              [512]             0          False
Linear               [2]               1,026      True
Where did all those layers between Flatten and my final fully connected layer come from?

Sorry again if this is a very basic question, and thanks in advance!

You're using cnn_learner, which makes its own modifications to the base architecture you give it (it cuts the model and attaches its own head, which is where those extra pooling, BatchNorm, Dropout and Linear layers come from). If you want to use your own architecture as-is, use Learner instead (have a look at the MNIST notebook in the course; it walks through building an NN from scratch, much like you're doing) :slight_smile:
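
Something along these lines should work (a rough, untested sketch; data is your existing DataBunch, and MoonNet2L is your class from above):

# Sketch: pass the model instance directly to Learner instead of using cnn_learner.
# fastai will pick a suitable loss function from the DataBunch.
from fastai.vision import Learner, accuracy

model = MoonNet2L(num_classes=2)
learn = Learner(data, model, metrics=accuracy)
learn.fit_one_cycle(1)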


Thank you so much! That was super helpful and exactly what I was looking for.