Glancing at PixelShuffle, I noticed the following lines:
if norm_type == NormType.Weight:
    layers[0][0].weight_v.data.copy_(icnr_init(layers[0][0].weight_v.data))
    layers[0][0].weight_g.data.copy_(((layers[0][0].weight_v.data**2).sum(dim=[1,2,3])**0.5)[:,None,None,None])
Just eyeballing it: that last line, which replaces .weight_g (the magnitude) with a transformation of .weight_v (the direction), seems a little odd. Is there any chance it should be using .weight_g instead? That is, I wonder if it should instead be:
if norm_type == NormType.Weight:
    layers[0][0].weight_v.data.copy_(icnr_init(layers[0][0].weight_v.data))
    layers[0][0].weight_g.data.copy_(((layers[0][0].weight_g.data**2).sum(dim=[1,2,3])**0.5)[:,None,None,None])
I don’t use weight_norm, so this doesn’t affect me directly. That also means I’m not sure this is a bug, so I didn’t want to pollute the GitHub Issues just yet, but I figured I’d pose it here as a question.
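To make the two variants easier to compare, here is a minimal sketch in plain Python (no PyTorch). It assumes the usual weight-norm reparameterisation, w = g * v / ||v||, with the norm taken per output channel, and that weight_g holds one value per output channel, as the [:,None,None,None] reshape suggests. The values in v and g_old are made up for illustration; they are not real icnr_init output.

```python
import math

def channel_norm(row):
    """L2 norm of one output channel's flattened direction weights."""
    return math.sqrt(sum(x * x for x in row))

def effective_weight(g, v):
    """Assumed weight-norm reparameterisation: w[i] = g[i] * v[i] / ||v[i]||."""
    return [[g_i * x / channel_norm(row) for x in row] for g_i, row in zip(g, v)]

# Hypothetical stand-in for the icnr_init result: 2 output channels, 4 weights each.
v = [[3.0, 4.0, 0.0, 0.0], [1.0, 2.0, 2.0, 4.0]]

# Variant from the library snippet: g is set to the per-channel norm of v.
g_from_v = [channel_norm(row) for row in v]
w_from_v = effective_weight(g_from_v, v)

# Variant proposed in this post: with one value per channel,
# sqrt(sum(g**2)) over the singleton dims reduces to |g|.
g_old = [1.0, 1.0]  # assumed pre-existing magnitudes, made up for illustration
g_from_g = [math.sqrt(g_i ** 2) for g_i in g_old]
w_from_g = effective_weight(g_from_g, v)

print(w_from_v)  # effective weight when g comes from v
print(w_from_g)  # effective weight when g comes from the old g
```

Under these assumptions, taking g from the norm of v makes the effective weight equal to v itself, while taking it from the old g leaves the magnitudes as they were and only rescales v's direction to them. Which of the two the library intends is still a question for the maintainers.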