It looks like there is no consensus yet on where to best place the BatchNorm layer. There is another thread about it here: Questions about batch normalization
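To make the two candidate placements concrete, here is a minimal pure-Python sketch. It is only an illustration: the helper names (`batch_norm`, `relu`, `linear`), the weights, and the toy batch are all made up for this example, and the normalization leaves out the learned scale and shift that a real BatchNorm layer has.

```python
import math

def batch_norm(xs, eps=1e-5):
    # Simplified BatchNorm over a batch of scalars: zero mean, unit variance.
    # Assumption: no learned scale/shift and no running statistics.
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

def relu(xs):
    return [max(0.0, x) for x in xs]

def linear(xs, w=2.0, b=-1.0):
    # Hypothetical one-feature linear layer with fixed weights.
    return [w * x + b for x in xs]

def bn_before_activation(xs):
    # Placement 1: linear -> BatchNorm -> ReLU
    return relu(batch_norm(linear(xs)))

def bn_after_activation(xs):
    # Placement 2: linear -> ReLU -> BatchNorm
    return batch_norm(relu(linear(xs)))

batch = [-1.0, 0.0, 1.0, 2.0]
pre = bn_before_activation(batch)
post = bn_after_activation(batch)
print("BN before ReLU:", pre)
print("BN after ReLU: ", post)
```

With BatchNorm before the activation, the next layer sees non-negative values; with BatchNorm after it, the next layer sees zero-mean values that can be negative. That difference in what the following layer receives is what the placement question comes down to.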