Can you please explain the goal of using batch normalization after ReLU?
ReLU introduces sparsity, so doesn't applying normalization on top of that sparse output lose information?
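For context, here is a minimal sketch (PyTorch, layer sizes illustrative) of the two orderings the question contrasts: the common Conv → BatchNorm → ReLU arrangement versus placing BatchNorm after the ReLU, where it normalizes an already sparse, non-negative activation map.

```python
import torch
import torch.nn as nn

# Common ordering: Conv -> BatchNorm -> ReLU
bn_before_relu = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

# Ordering in question: Conv -> ReLU -> BatchNorm
# Here BatchNorm sees the sparse output of the ReLU (many exact zeros).
bn_after_relu = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(16),
)

x = torch.randn(8, 3, 32, 32)
print(bn_before_relu(x).shape, bn_after_relu(x).shape)
```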