Using LSUV for the ImageNette (09) notebooks - how to integrate?

I wanted to integrate the LSUV init for the CNNs in the 09 notebook so I could see how it improves the results.
However, I’m only able to get about halfway with the integration, and I’m wondering whether someone else has done it and could share their code?

Here’s what I have so far, in 09b. The learner is all set, so I then find the Conv2d modules:

learn = get_learner(nfs, data, 0.4, conv_layer, cb_funcs=cbfs)

#get the relevant modules
mods = find_modules(learn.model, lambda o: isinstance(o,nn.Conv2d))

mods
[Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False),
Conv2d(16, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False),
Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False),
Conv2d(64, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False),
Conv2d(32, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False),
Conv2d(32, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False),
Conv2d(32, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)]

So, I’ve got the relevant modules. However, I’m stuck on how to get a single batch in the new learner setup and how to hook into the learner to adjust the weights.

def append_stat(hook, mod, inp, outp):
    d = outp.data
    hook.mean,hook.std = d.mean().item(),d.std().item()

with Hooks(mods, append_stat) as hooks:
    mdl(xb)
    for hook in hooks: print(hook.mean,hook.std)

and:
def lsuv_module(m, xb):
    h = Hook(m, append_stat)

    while mdl(xb) is not None and abs(h.mean)  > 1e-3: m.bias -= h.mean
    while mdl(xb) is not None and abs(h.std-1) > 1e-3: m.weight.data /= h.std

    h.remove()
    return h.mean,h.std

lsuv_module is the only one of these exported for re-use, which makes me think it is meant to be integrated into the framework more directly, rather than by bringing along the first two helper snippets?

Otherwise, how do I get a single batch for the xb variable to try and then hook in?

I’m continuing to work through it, but if anyone has tips, hints, or even a working code example, I would greatly appreciate it!

how do I get a single batch for the xb variable to try and then hook in?

xb,yb = next(iter(data.train_dl))
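
In case it helps anyone landing here later, below is a minimal sketch of the variance-scaling step in plain PyTorch, without the course's Hook/Hooks helpers. Everything in it is an assumption made for illustration: the small `model` stands in for `learn.model`, the random `xb` stands in for the batch from `data.train_dl`, and `output_std` and `lsuv_scale` are hypothetical helper names, not part of the notebooks. It only rescales weights to unit output std; the mean/bias step is left out because the convs listed above are created with bias=False, so their `bias` attribute is None and `m.bias -= h.mean` would fail on them.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical small CNN; in the notebook this would be learn.model.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=1, padding=1, bias=False), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1, bias=False), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1, bias=False), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10),
)

# Random stand-in for: xb, yb = next(iter(data.train_dl))
xb = torch.randn(64, 3, 32, 32)

def output_std(model, module, xb):
    """Run one forward pass and return the std of `module`'s output."""
    stats = {}
    def record(mod, inp, outp):
        stats["std"] = outp.std().item()
    handle = module.register_forward_hook(record)
    try:
        with torch.no_grad():
            model(xb)
    finally:
        handle.remove()  # always detach the hook, even if the pass fails
    return stats["std"]

def lsuv_scale(model, module, xb, tol=1e-3, max_iter=10):
    """Rescale `module.weight` until its output std is within `tol` of 1."""
    for _ in range(max_iter):
        std = output_std(model, module, xb)
        if abs(std - 1) < tol:
            break
        with torch.no_grad():
            module.weight /= std
    return output_std(model, module, xb)

# Process the convs in forward order, since rescaling an earlier layer
# changes the input statistics of every later one.
convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
final_stds = [lsuv_scale(model, m, xb) for m in convs]
print(final_stds)
```

If the model lives on the GPU, xb has to be moved to the same device before calling it.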


Thank you - much appreciated!