Hello guys! I was revisiting some references on batch norm, and a statement in the Lesson 3 Wiki confused me:
- Make all four of these values (the mean and standard deviation for normalization, and the two new parameters to set arbitrary mean and standard deviations after the normalization) trainable.
From my understanding, only the two new parameters are trainable (gamma and beta), not all four as stated (which includes mean and std). Or am I missing something?
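To make my understanding concrete, here is a minimal sketch of how I picture the training-time forward pass for a single feature. It is plain Python rather than any library's actual implementation, and the function name `batch_norm`, the `eps` value, and the toy batch are my own assumptions for illustration. The mean and variance are recomputed from each mini-batch, while gamma and beta are the only values an optimizer would update.

```python
import math

def batch_norm(xs, gamma, beta, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift.

    Hypothetical helper for illustration; eps is an assumed default.
    """
    n = len(xs)
    # Batch statistics: computed from the data, not learned by gradient descent.
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    # gamma and beta are the parameters the optimizer would update.
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in xs]

batch = [1.0, 2.0, 3.0, 4.0]  # toy mini-batch (assumed values)
out = batch_norm(batch, gamma=2.0, beta=5.0)

# The output's mean is approximately beta and its std approximately gamma.
out_mean = sum(out) / len(out)
out_std = math.sqrt(sum((y - out_mean) ** 2 for y in out) / len(out))
```

If this picture is right, the "mean and standard deviation for normalization" in the wiki are statistics of the batch, and only the scale and shift applied afterwards are learned.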