Create_head sets bias=False for the last Linear layer in cnn models


I noticed that fastai's custom CNN head (the create_head function) sets bias=False for the last Linear layer. However, fastai v1 (1.0.62) and many other libraries set bias=True for this layer. I can get similar behavior if I set bn_final to True, but I think that option exists to scale the network output to the right range (page 461 in the fastai book). So I can't understand why bias=False. To me, it looks like a bug. Does anyone have any idea about it?

I ran git blame on create_head and LinBnDrop, and they haven't been touched since the initial commit.

@muellerzr, should I tag jeremy or sgugger so they can explain this?


I have the same problem. I need to set bias=True for the last layer of resnet in cnn_learner.

learn = cnn_learner(data,models.resnet34,metrics=error_rate)

how did you solve this?

Not sure whether this is a bug or not, but you could easily do it yourself:

import torch.nn as nn

def add_bias_last_lin(m):
    # Iterate over the head's layers from last to first
    for i in range(len(m))[::-1]:
        curr = m[i]
        # i < (len(m)-1) ensures we don't access an index out of range
        nxt = m[i+1] if (i < (len(m)-1)) else None
        if isinstance(curr, nn.Linear):
            # If bn_final=True, a BatchNorm follows and we don't need to add bias
            if not isinstance(nxt, nn.BatchNorm1d):
                m[i] = nn.Linear(curr.in_features, curr.out_features, bias=True)
            # Only the last Linear layer needs this, so we exit
            break

learn = cnn_learner(dls, arch)
# The last Sequential in the model is the head
add_bias_last_lin(learn.model[-1])

Checking the code of LinBnDrop, there is the following line:

lin = [nn.Linear(n_in, n_out, bias=not bn)]

So it's not a bug: it's done because BatchNorm learns a shift (and a scale) for the layer's output after normalizing it by the batch mean and variance, and that learned shift plays exactly the role of a bias.
So when bn is on, no bias is included.
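A small PyTorch sketch (variable names are mine) of why a bias feeding directly into a BatchNorm layer is redundant: in training mode, BatchNorm subtracts the per-feature batch mean, so any constant bias added by the preceding Linear layer cancels out entirely.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

lin_bias = nn.Linear(4, 3, bias=True)
lin_nobias = nn.Linear(4, 3, bias=False)
with torch.no_grad():
    # Share the same weights; lin_bias additionally gets a large, arbitrary bias
    lin_nobias.weight.copy_(lin_bias.weight)
    lin_bias.bias.fill_(5.0)

bn = nn.BatchNorm1d(3)
bn.train()  # normalize with batch statistics, as during training

x = torch.randn(8, 4)
out_bias = bn(lin_bias(x))
out_nobias = bn(lin_nobias(x))

# The constant bias shifts the per-feature batch mean by the same amount,
# so after mean subtraction the two outputs are identical
print(torch.allclose(out_bias, out_nobias, atol=1e-5))  # → True
```

The gradient for such a bias would always be killed by the mean subtraction, so keeping it only wastes parameters; BatchNorm's own learnable beta takes over the job of shifting the output.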

Why do you think you need the bias?