I noticed that the fastai custom CNN head (the create_head function) sets bias=False for the last linear layer. However, fastai v1 (1.0.62) and many other libraries set bias=True in this layer. I can get similar behavior if I set bn_final to True, but I think that option is meant to scale the network output to the right range (page 461 of the fastai book). So I can’t understand why bias=False; to me it looks like a bug. Does anyone have any idea about it?
I ran git blame on create_head and LinBnDrop, and they haven’t been touched since the initial commit.
@muellerzr, should I tag jeremy or sgugger so they can explain this?
Not sure whether this is a bug or not, but you could easily do it yourself:
import torch.nn as nn

def add_bias_last_lin(m):
    # Walk the head backwards so we hit the last Linear first
    for i in range(len(m))[::-1]:
        curr = m[i]
        # i < (len(m)-1) ensures we wouldn't try to access an index out of range
        nxt = m[i+1] if i < (len(m) - 1) else None
        if isinstance(curr, nn.Linear):
            # If bn_final=True, we don't need to add bias
            if not isinstance(nxt, nn.BatchNorm1d):
                m[i] = nn.Linear(curr.in_features, curr.out_features, bias=True)
            # Only needed for the last linear layer, so we exit
            break
learn = cnn_learner(dls, arch)
# The last Sequential in the model is the head
add_bias_last_lin(learn.model[-1])
Checking the code of LinBnDrop, there is the following line:
lin = [nn.Linear(n_in, n_out, bias=not bn)]
So it’s not a bug: it’s done because BatchNorm already ‘learns’ a shift (and scale) of the layer’s output, after normalizing it by the mean and variance of the activations. A bias in the preceding Linear would be absorbed by that normalization, so it’s redundant.
That’s why, when bn is on, no bias is included.
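You can see the redundancy numerically. This is a minimal pure-Python sketch (not fastai’s implementation) of the normalization step of batch norm; because it subtracts the batch mean, adding a constant bias to every activation changes nothing:

```python
# Normalization step of batch norm: subtract the batch mean, divide by std.
def batchnorm(xs, eps=1e-5):
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / len(xs)
    return [(x - m) / (var + eps) ** 0.5 for x in xs]

acts = [1.0, 2.0, 3.0, 4.0]
with_bias = [a + 10.0 for a in acts]  # same activations plus a constant bias

# The constant shifts the mean by the same amount, so it cancels out.
print(batchnorm(acts) == batchnorm(with_bias))  # prints True
```

The learnable shift (beta) and scale (gamma) of BatchNorm, omitted above for brevity, then play the role the Linear bias would have played.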