Why is the groups variable being overwritten after the for loop (it seems like an error)?
I feel like something is wrong with this function because the groups variable is being overwritten; however, I'm also confused about what this function is supposed to accomplish in the first place. Any insight or help would be much appreciated.
From my understanding, what they are trying to do is apply a different learning rate to different parts of the model. Each part might have one or more layers. These different parts are the groups:

groups = [part1, part2, part3, ...]

Each part is an nn.Sequential object, and for each part they want to apply a different learning rate.
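As a rough sketch of that idea (the toy model and names here are mine, not the ones from the notebook):

from torch import nn

# toy model split into two "parts" to be trained with different learning rates
part1 = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 20))
part2 = nn.Sequential(nn.ReLU(), nn.Linear(20, 2))
model = nn.Sequential(part1, part2)   # the parts together make up the model

groups = [part1, part2]               # groups = [part1, part2, ...]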
1) The output of the splitter is a list of lists of parameters:

output = [list(part.parameters()) for part in groups]

So output is basically a list of parameter groups. When applying different learning rates, we loop over this output and apply the same learning rate to all the parameters in each part.
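In plain PyTorch terms, this is the same mechanism as passing parameter groups with their own lr to an optimizer (the learning-rate values below are just placeholders):

import torch
from torch import nn

# two toy parameter groups standing in for two parts of a model
output = [list(nn.Linear(10, 20).parameters()),
          list(nn.Linear(20, 2).parameters())]

lrs = [1e-4, 1e-3]  # placeholder values: smaller lr for the earlier part
opt = torch.optim.Adam([{'params': params, 'lr': lr}
                        for params, lr in zip(output, lrs)])

for g in opt.param_groups:
    print(len(g['params']), g['lr'])  # every parameter inside a group shares its lr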
2) The groups variable being overwritten seems like an error to me as well; that line should append to the existing groups variable.
The great thing about notebooks is that you can delve into the workings easily by splitting and adding cells to enable discovery of inputs and outputs.
I concur with your bug description :-
from torch import nn

def lm_splitter(m):
    groups = []
    # one group per RNN layer, paired with its hidden-state dropout
    for i in range(len(m[0].rnns)):
        groups.append(nn.Sequential(m[0].rnns[i], m[0].hidden_dps[i]))
    # groups = [nn.Sequential(m[0].encoder, m[0].input_dp, m[1])]  # buggy: overwrites the loop's work
    groups.append(nn.Sequential(m[0].encoder, m[0].input_dp, m[1]))  # fix: append instead
    return [list(o.parameters()) for o in groups]
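With the append in place, the function returns one parameter list per RNN layer plus a final list for the encoder/input-dropout/decoder part, so each of those groups can then be given its own learning rate as described above.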