Thanks for sharing the paper.
You may like the thread here. It discusses a toy dataset that's easily fitted using a tanh activation rather than the conventional choice of ReLU.