Questions about batch normalization

Well spotted. When I implemented this I did some searching, and the more recent advice, based on some experiments, seems to be to put batch normalization after the non-linearity.
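
To make the two orderings concrete, here is a rough sketch in plain Python. It is not the code from my implementation: `linear`, its weight and bias, and the toy batch are all made up for illustration, and the batch norm here has no learned scale or shift.

```python
import math

def batch_norm(xs, eps=1e-5):
    # Normalize a batch of scalars to zero mean and (roughly) unit variance.
    # Simplified: no learned scale/shift, no running statistics.
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]

def relu(xs):
    # Element-wise ReLU non-linearity.
    return [max(0.0, x) for x in xs]

def linear(xs, w=2.0, b=-1.0):
    # Hypothetical one-unit layer; w and b are arbitrary illustration values.
    return [w * x + b for x in xs]

def bn_before_activation(xs):
    # Ordering 1: linear -> batch norm -> non-linearity.
    return relu(batch_norm(linear(xs)))

def bn_after_activation(xs):
    # Ordering 2: linear -> non-linearity -> batch norm.
    return batch_norm(relu(linear(xs)))

batch = [-1.0, 0.0, 1.0, 2.0]
pre = bn_before_activation(batch)
post = bn_after_activation(batch)
print("BN before ReLU:", pre)
print("BN after ReLU: ", post)
```

The practical difference is what the next layer sees: with batch norm after the non-linearity its input is zero-mean, whereas with batch norm before a ReLU the output is non-negative and so no longer centered.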
