I realize that the fastai custom CNN head (the create_head function) sets bias=False for the last layer. However, fastai v1 (1.0.62) and many other libraries set bias=True in this layer. I can get similar behavior if I set bn_final to True, but I think that option exists to scale the network output to the right range (page 461 in the fastai book). So I can't understand why bias=False. To me it looks like a bug. Does anyone have any idea about it?
I ran git blame on LinBnDrop and it hasn't been touched since the initial commit.
@muellerzr, should I tag jeremy or sgugger so they can explain this?
I have the same problem. I need to set bias=True for the last layer of resnet in cnn_learner.
learn = cnn_learner(data,models.resnet34,metrics=error_rate)
how did you solve this?
Not sure whether this is a bug or not, but you could easily do it yourself:
learn = cnn_learner(dls, arch)
# The last Sequential in the model is the head
m = learn.model[-1]
for i in range(len(m))[::-1]:
    curr = m[i]
    # i < (len(m)-1) ensures we don't access an index out of range
    nxt = m[i+1] if i < (len(m)-1) else None
    if isinstance(curr, nn.Linear):
        # If the Linear is followed by BatchNorm, the bias isn't needed
        if not isinstance(nxt, nn.BatchNorm1d):
            m[i] = nn.Linear(curr.in_features, curr.out_features, bias=True)
        # Only needed for the last linear layer, so we exit
        break
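To sanity-check the loop without a full fastai setup, here's the same backward walk run on a toy head (the layer sizes are made up for illustration; the real create_head output also has pooling/flatten layers):

```python
import torch.nn as nn

# Toy head standing in for fastai's create_head output (sizes are
# hypothetical; the real head also contains pooling/flatten layers)
m = nn.Sequential(
    nn.BatchNorm1d(512), nn.Dropout(0.25), nn.Linear(512, 512, bias=False),
    nn.ReLU(inplace=True),
    nn.BatchNorm1d(512), nn.Dropout(0.5), nn.Linear(512, 10, bias=False),
)

# Walk backwards and give the last Linear a bias, as in the loop above
for i in range(len(m))[::-1]:
    curr = m[i]
    nxt = m[i + 1] if i < (len(m) - 1) else None
    if isinstance(curr, nn.Linear):
        if not isinstance(nxt, nn.BatchNorm1d):
            m[i] = nn.Linear(curr.in_features, curr.out_features, bias=True)
        break

assert m[-1].bias is not None  # last Linear now has a bias
assert m[2].bias is None       # earlier Linear is untouched
```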
Checking the code of LinBnDrop (e.g. with LinBnDrop?? in a notebook), there is the following line:
lin = [nn.Linear(n_in, n_out, bias=not bn)]
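For context, LinBnDrop behaves roughly like the following sketch (simplified; the real fastai class also supports lin_first and other options, so treat this as an approximation, not the actual source):

```python
import torch.nn as nn

def lin_bn_drop(n_in, n_out, bn=True, p=0.0, act=None):
    """Simplified sketch of fastai's LinBnDrop: BatchNorm -> Dropout -> Linear.

    The key detail is bias=not bn: the Linear layer only gets a bias
    when BatchNorm is absent.
    """
    layers = [nn.BatchNorm1d(n_in)] if bn else []
    if p != 0.0:
        layers.append(nn.Dropout(p))
    layers.append(nn.Linear(n_in, n_out, bias=not bn))
    if act is not None:
        layers.append(act)
    return nn.Sequential(*layers)

with_bn = lin_bn_drop(512, 10, bn=True, p=0.5)
without_bn = lin_bn_drop(512, 10, bn=False)
assert with_bn[-1].bias is None         # bn=True  -> bias=False
assert without_bn[-1].bias is not None  # bn=False -> bias=True
```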
So it's not a bug: BatchNorm normalizes the activations by their batch mean and variance, and then 'learns' its own shift (beta) and scale (gamma). Any bias added by an adjacent Linear layer would be absorbed by that normalization and shift, so when bn is on, no bias is included.
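You can check this numerically for the Linear -> BatchNorm case: a per-feature bias is a constant added to every sample, so the batch-mean subtraction cancels it exactly (a small sketch, using plain PyTorch):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(64, 8)

lin = nn.Linear(8, 4, bias=True)
bn = nn.BatchNorm1d(4)  # training mode: normalizes with batch statistics

with torch.no_grad():
    out_with_bias = bn(lin(x))
    lin.bias.zero_()                 # remove the bias in place
    out_without_bias = bn(lin(x))

# The per-feature bias is subtracted away by the mean normalization,
# so the BatchNorm output is identical with or without it
assert torch.allclose(out_with_bias, out_without_bias, atol=1e-5)
```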
Why do you think you need the bias?