Anyone know the difference between these two methods: `unfreeze(self)` and `bn_freeze(self, do_freeze)` when `do_freeze` is set to `False`? Any guesses as to what the `bn` stands for?
It’s pretty clear from the code below that `unfreeze` sets the children in the model to trainable:
```python
def freeze_to(self, n):
    c = self.get_layer_groups()
    for l in c: set_trainable(l, False)
    for l in c[n:]: set_trainable(l, True)

def unfreeze(self): self.freeze_to(0)
```
But what are children? It looks like `children` is a PyTorch concept that represents the layers in the model. Is this correct? Do all the children represent all the layers? If so, then I think calling `unfreeze` makes it so that all of the weights for all of the layers are updated when we do our training during `fit`.
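To convince myself of the freezing logic, here’s a minimal pure-Python sketch of what `freeze_to` does; `DummyLayer`, my `set_trainable`, and the group list are my own stand-ins just to trace the flags, not fastai code:

```python
class DummyLayer:
    def __init__(self, name):
        self.name = name
        self.trainable = False

def set_trainable(l, b):
    # stand-in for fastai's set_trainable, which (I believe) flips requires_grad
    l.trainable = b

def freeze_to(groups, n):
    # same shape as Learner.freeze_to: freeze everything, then unfreeze groups[n:]
    for l in groups: set_trainable(l, False)
    for l in groups[n:]: set_trainable(l, True)

groups = [DummyLayer(f'g{i}') for i in range(4)]
freeze_to(groups, 2)
print([l.trainable for l in groups])  # [False, False, True, True]
freeze_to(groups, 0)                  # equivalent to unfreeze()
print([l.trainable for l in groups])  # [True, True, True, True]
```

So `unfreeze` is just the `n=0` case: nothing stays frozen.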
It’s a little less clear what `bn_freeze(False)` would do, because you have to understand what `apply_leaf` does and how the model’s `bn_freeze` attribute is used, as you can see below:
```python
def set_bn_freeze(self, m, do_freeze):
    if hasattr(m, 'running_mean'): m.bn_freeze = do_freeze

def bn_freeze(self, do_freeze):
    apply_leaf(self.model, lambda m: self.set_bn_freeze(m, do_freeze))
```
It looks like `apply_leaf` recursively walks all the children in the model and applies the lambda above, which sets `bn_freeze` to `False` (in our example) on every module that has a `running_mean` attribute. See the `apply_leaf` code here:
```python
def apply_leaf(m, f):
    c = children(m)
    if isinstance(m, nn.Module): f(m)
    if len(c) > 0:
        for l in c: apply_leaf(l, f)
```
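Here’s a toy version of that recursion so I could trace the traversal order; `Node` and my `children` helper are made up for illustration, they’re not the fastai/PyTorch versions (and I dropped the `nn.Module` check):

```python
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self._children = list(children)

def children(m):
    # stand-in for PyTorch's children(): immediate submodules only
    return m._children

def apply_leaf(m, f):
    # same shape as the fastai version above, minus the nn.Module check
    c = children(m)
    f(m)
    for l in c: apply_leaf(l, f)

visited = []
tree = Node('root', [Node('a', [Node('a1')]), Node('b')])
apply_leaf(tree, lambda m: visited.append(m.name))
print(visited)  # ['root', 'a', 'a1', 'b']
```

So `f` is applied to every module in the tree, not just the top-level children, which is why `bn_freeze` reaches batchnorm layers nested deep inside the model.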
And it looks like the model’s `bn_freeze` attribute is only used in one place in the fastai code, inside `model.py`:
```python
def set_train_mode(m):
    if (hasattr(m, 'running_mean') and
        (getattr(m, 'bn_freeze', False) or not getattr(m, 'trainable', False))): m.eval()
    else: m.train()
```
The `set_train_mode` function gets called when we do our training during `fit`. Looks like `train` is called if `bn_freeze` is `False`, so long as `trainable` is also set to `True`, which I think it is.
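To double-check that reading, here’s the condition from `set_train_mode` pulled out into a standalone function (my own rewrite, just to print the truth table for a batchnorm-style module):

```python
def set_train_mode_decision(has_running_mean, bn_freeze, trainable):
    # mirrors the if/else in fastai's set_train_mode
    if has_running_mean and (bn_freeze or not trainable):
        return 'eval'
    return 'train'

# For a batchnorm-style module (one that has running_mean):
for bn_freeze in (False, True):
    for trainable in (False, True):
        print(f'bn_freeze={bn_freeze}, trainable={trainable} -> '
              f'{set_train_mode_decision(True, bn_freeze, trainable)}')
```

The only combination that ends up in `train` mode is `bn_freeze=False` with `trainable=True`; modules without `running_mean` always get `train`.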
So this looks similar to `unfreeze` in that it trains all the children when `bn_freeze` is set to `False`. I’m sure I’m missing something, though, since why would we have two different ways of doing the same thing? My guess is that `trainable` being set to `True` must mean something different than `bn_freeze` being set to `False`.
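Here’s a toy model of that guess, which may well be wrong; `FakeBN` is entirely made up, not fastai or PyTorch code. The idea is that `trainable` gates whether the weights get gradient updates, while `bn_freeze` only gates whether the batchnorm running statistics keep moving (via the `eval()` call):

```python
class FakeBN:
    # toy batchnorm: two things can change during training -- the affine
    # weight (a parameter) and the running mean (a statistic)
    def __init__(self):
        self.weight, self.running_mean = 1.0, 0.0
        self.trainable, self.bn_freeze = True, False
        self.training = True

    def set_train_mode(self):
        # same rule as fastai's set_train_mode for a module with running stats
        self.training = not (self.bn_freeze or not self.trainable)

    def forward(self, batch_mean):
        # running stats only move in training mode
        if self.training:
            self.running_mean = 0.9 * self.running_mean + 0.1 * batch_mean

    def step(self, grad):
        # weight updates are gated by trainable, not by train/eval mode
        if self.trainable:
            self.weight -= 0.1 * grad

bn = FakeBN()
bn.bn_freeze = True   # freeze the stats but leave the weights trainable
bn.set_train_mode()
bn.forward(5.0); bn.step(1.0)
print(bn.running_mean, bn.weight)  # 0.0 0.9 -- stats frozen, weight still moved
```

If that picture is right, the two flags are independent knobs rather than two ways of doing the same thing, which would explain why both exist. Happy to be corrected.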