Is this NN initialization okay?


#1

The CNN I’m playing with looks just like the DAWNBench CIFAR-10 leader, but with a tanh output activation so that it works as a binary classifier.

No training has happened yet. Out of curiosity, I checked what happens when I feed it random noise as input (mean=0, std=0.1). I am now concerned that, depending on the random weights the network gets at creation, the average output can be quite far from zero. In fact, with noise of (mean=0, std=1.0) the output is often saturated.
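
To make the check concrete, here is a rough sketch of the same kind of experiment. It is not my actual CNN: it uses a tiny pure-Python ReLU MLP with a tanh output as a stand-in, and the layer sizes, He-style initialization, absence of biases, the 0.99 saturation threshold, and the names `make_mlp`, `forward`, and `probe` are all assumptions made up for illustration.

```python
import math
import random

def make_mlp(rng, sizes):
    """He-style random weight matrices (no biases) for a small ReLU MLP."""
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        std = math.sqrt(2.0 / fan_in)
        layers.append([[rng.gauss(0.0, std) for _ in range(fan_in)]
                       for _ in range(fan_out)])
    return layers

def forward(layers, x):
    """ReLU on hidden layers, tanh on the final (single) output unit."""
    for i, weights in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) for row in weights]
        is_last = i == len(layers) - 1
        x = [math.tanh(v) for v in z] if is_last else [max(0.0, v) for v in z]
    return x[0]

def probe(seed, input_std, n_samples=100, sizes=(16, 32, 32, 1), threshold=0.99):
    """Mean output and saturated fraction for one untrained random network."""
    layers = make_mlp(random.Random(seed), sizes)
    # Fixed noise seed: every network and every std sees the same base noise,
    # only rescaled, so differences come from the weights and the input scale.
    noise = random.Random(1234)
    outputs = []
    for _ in range(n_samples):
        x = [input_std * noise.gauss(0.0, 1.0) for _ in range(sizes[0])]
        outputs.append(forward(layers, x))
    mean_out = sum(outputs) / len(outputs)
    saturated = sum(abs(o) > threshold for o in outputs) / len(outputs)
    return mean_out, saturated

# Compare several random initializations at both input scales.
for seed in range(5):
    for std in (0.1, 1.0):
        mean_out, saturated = probe(seed, std)
        print(f"seed={seed} input_std={std}: "
              f"mean output={mean_out:+.3f}, saturated={saturated:.0%}")
```

Each seed gives a different untrained network, so comparing the printed mean outputs across seeds shows how much the pre-training bias depends on the draw of the weights. Because this stand-in has no biases, scaling the input scales the pre-activation of the output unit by the same factor, which is why a larger input std can only push the tanh further toward saturation here.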

Is this a concern, or just something that training undoes quickly?

Here is the short notebook, for completeness.