In softmax, why is there a shuffle?

The order of operations shouldn’t really matter, so why shuffle?

I found this code in a tutorial!

import theano.tensor as T

def softmax(X):
    # subtract the row-wise max before exponentiating, for numerical stability
    e_x = T.exp(X - X.max(axis=1).dimshuffle(0, 'x'))
    return e_x / e_x.sum(axis=1).dimshuffle(0, 'x')

This is a function from Theano, which has reached its end of life, so I'm not sure there's still a good reason to use it. In any case, `dimshuffle` rearranges (and inserts) the dimensions of a multi-dimensional array. Here, `dimshuffle(0, 'x')` takes the shape-`(n,)` result of the row-wise `max`/`sum` and adds a broadcastable axis, turning it into a shape-`(n, 1)` column so it can be subtracted from (or divide) each row of `X`. It has nothing to do with randomly shuffling the entries of an array.
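For comparison, here is a sketch of the same numerically stable softmax in plain NumPy (not from the tutorial): `keepdims=True` plays the role of `dimshuffle(0, 'x')` by keeping the reduced axis, so the per-row max and sum broadcast against each row.

```python
import numpy as np

def softmax(X):
    # subtract each row's max so exp() can't overflow; keepdims=True keeps
    # the result as an (n, 1) column, analogous to dimshuffle(0, 'x')
    e_x = np.exp(X - X.max(axis=1, keepdims=True))
    return e_x / e_x.sum(axis=1, keepdims=True)

X = np.array([[1.0, 2.0, 3.0],
              [1000.0, 1001.0, 1002.0]])  # would overflow without the max trick
print(softmax(X))
```

Because softmax(x) == softmax(x - c) for any constant c, subtracting the max changes nothing mathematically but keeps the exponents small.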

Thanks for the explanation of the shuffling. I'm only using Theano because it's part of lesson 6. I could probably do the same exercise in TensorFlow or PyTorch, or any number of other libraries, but I don't think it hurts to follow along with the course.