Gradients for softmax are tiny [Solved]

Immediately after getting this, I got stuck on cross entropy + softmax.

The derivative I get from the chain rule and the analytical derivative seem to be different.

I have written up a detailed question here: https://math.stackexchange.com/questions/2843505/derivative-of-softmax-without-cross-entropy

That post contains the corrected derivation along with the question itself.
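
For anyone hitting the same mismatch, one way to test whether a chain-rule gradient and the analytical gradient really differ is to compare both against a finite-difference estimate. The sketch below is only an illustration, not the derivation from the linked post. It assumes a single example, a target vector that sums to 1, and the natural log in the loss. All function names and the sample values of `z` and `y` are made up for this example.

```python
import math

def softmax(z):
    # Subtract the max before exponentiating for numerical stability.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(p, y):
    # L = -sum_i y_i * log(p_i); terms with y_i == 0 are skipped.
    return -sum(yi * math.log(pi) for yi, pi in zip(y, p) if yi != 0.0)

def grad_chain_rule(z, y):
    # dL/dz_j = sum_i (dL/dp_i) * (dp_i/dz_j), using the full softmax Jacobian
    # dp_i/dz_j = p_i * (delta_ij - p_j).
    p = softmax(z)
    n = len(z)
    dL_dp = [-y[i] / p[i] for i in range(n)]
    grad = []
    for j in range(n):
        total = 0.0
        for i in range(n):
            delta = 1.0 if i == j else 0.0
            total += dL_dp[i] * p[i] * (delta - p[j])
        grad.append(total)
    return grad

def grad_analytic(z, y):
    # Closed form for softmax followed by cross entropy: p - y
    # (valid when the entries of y sum to 1).
    p = softmax(z)
    return [pi - yi for pi, yi in zip(p, y)]

def grad_numeric(z, y, eps=1e-6):
    # Central finite differences on the composed loss.
    grad = []
    for j in range(len(z)):
        z_plus = list(z)
        z_minus = list(z)
        z_plus[j] += eps
        z_minus[j] -= eps
        loss_plus = cross_entropy(softmax(z_plus), y)
        loss_minus = cross_entropy(softmax(z_minus), y)
        grad.append((loss_plus - loss_minus) / (2 * eps))
    return grad

# Hypothetical logits and one-hot target, chosen only for illustration.
z = [2.0, 1.0, 0.1]
y = [1.0, 0.0, 0.0]

chain = grad_chain_rule(z, y)
analytic = grad_analytic(z, y)
numeric = grad_numeric(z, y)

max_chain_vs_analytic = max(abs(a - b) for a, b in zip(chain, analytic))
max_numeric_vs_analytic = max(abs(a - b) for a, b in zip(numeric, analytic))
print(max_chain_vs_analytic, max_numeric_vs_analytic)
```

If the chain-rule version disagrees with the other two, a common place to look is the Jacobian: the sum over `i` has to include the off-diagonal terms (`i != j`), not just the diagonal `p_j * (1 - p_j)`.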