Is PixelShuffle initialized correctly?

jamesp · January 2, 2023, 7:41pm

Glancing at PixelShuffle, I noticed the following lines:

if norm_type == NormType.Weight:
    layers[0][0].weight_v.data.copy_(icnr_init(layers[0][0].weight_v.data))
    layers[0][0].weight_g.data.copy_(((layers[0][0].weight_v.data**2).sum(dim=[1,2,3])**0.5)[:,None,None,None])

Just eyeballing it, that last line, which overwrites .weight_g (the magnitude) with a value computed from .weight_v (the direction), seems a little odd. Is there any chance it should be using .weight_g instead? That is, I wonder if it should instead be:

if norm_type == NormType.Weight:
    layers[0][0].weight_v.data.copy_(icnr_init(layers[0][0].weight_v.data))
    layers[0][0].weight_g.data.copy_(((layers[0][0].weight_g.data**2).sum(dim=[1,2,3])**0.5)[:,None,None,None])

I don’t use weight_norm, so this doesn’t affect me directly, which also means I’m not sure this is a bug. I didn’t want to pollute the GitHub Issues just yet, but figured I’d pose it here as a question.

jamesp · January 4, 2023, 6:22am

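Staring at it a bit more, it can’t just copy its own weight: weight_g holds one value per output channel, so my proposed version would leave it essentially unchanged and it would not reflect the new ICNR init. On the other hand, assuming the usual weight-norm definition w = g * v / ||v||, the magnitude can be derived from the direction vectors: setting g to the per-filter norm of v makes the effective weight equal to the ICNR-initialized v. So the way it’s set up in fastai seems correct.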
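
Here is a minimal sketch of that reasoning, written in plain Python rather than PyTorch so it runs without dependencies. It assumes the standard weight-norm reparameterization w = g * v / ||v|| applied per output channel. The helper names (l2_norm, effective_weight) and the toy values are made up for illustration and are not fastai's code.

```python
import math

def l2_norm(filt):
    """Euclidean norm of one (flattened) filter."""
    return math.sqrt(sum(x * x for x in filt))

def effective_weight(g, v):
    """Assumed weight-norm rule: w = g * v / ||v||, one g per output channel."""
    return [[g_i * x / l2_norm(v_i) for x in v_i] for g_i, v_i in zip(g, v)]

# Toy stand-in for the ICNR-initialized weight_v: two output channels,
# each filter flattened to four values.
v = [[3.0, 4.0, 0.0, 0.0],
     [1.0, 2.0, 2.0, 0.0]]

# Stand-in for whatever weight_g held before the re-initialization.
g_stale = [1.0, 1.0]

# What the fastai line does: g = per-filter norm of v.
g_from_v = [l2_norm(v_i) for v_i in v]

# What the proposed alternative would do: g = sqrt(g**2), i.e. g unchanged.
g_from_g = [math.sqrt(g_i ** 2) for g_i in g_stale]

w_fastai = effective_weight(g_from_v, v)  # reproduces v
w_alt = effective_weight(g_from_g, v)     # v rescaled to the stale magnitude

print(w_fastai)
print(w_alt)
```

With g taken from the norm of v, the effective weight matches v, so the ICNR initialization survives the reparameterization. With g taken from itself, each filter is rescaled to the old magnitude instead.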