Where should I place the batch normalization layer(s)?

Could you please explain the goal of using batch normalization after ReLU? ReLU introduces sparsity, so wouldn't applying normalization on top of that sparse output cause a loss of information?
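
To make the two placements I'm asking about concrete, here is a minimal PyTorch sketch (the channel sizes and block names are just placeholders I picked for illustration):

```python
import torch.nn as nn

# Option A: Conv -> BatchNorm -> ReLU (normalize the pre-activation)
block_bn_before_relu = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)

# Option B: Conv -> ReLU -> BatchNorm (normalize the sparse, rectified output)
block_bn_after_relu = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False),
    nn.ReLU(inplace=True),
    nn.BatchNorm2d(64),
)
```

Option A is what the original batch normalization paper describes; Option B is the ordering whose purpose I'm asking about.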