Jeremy says in lesson 5 that we square the numbers and divide them by the total to get the confidence via softmax. Why not just use absolute value instead of squaring the numbers? Is it computationally more expensive?
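To make the comparison concrete, here is a minimal sketch of the two options I'm asking about: square then divide by the total, versus absolute value then divide by the total. The exponential version, which is the usual textbook definition of softmax, is included for reference. The `normalize` helper and the sample activations are my own made-up illustration, not code from the lesson.

```python
import math

def normalize(values, transform):
    """Apply `transform` to each value, then divide by the total so the results sum to 1."""
    transformed = [transform(v) for v in values]
    total = sum(transformed)
    return [t / total for t in transformed]

# Hypothetical raw outputs for three classes (made-up numbers, not from the lesson).
activations = [2.0, -1.0, 0.5]

squared = normalize(activations, lambda v: v * v)  # square, then divide by the total
absolute = normalize(activations, abs)             # absolute value, then divide by the total
exponential = normalize(activations, math.exp)     # exp, then divide by the total (textbook softmax)

for name, result in [("squared", squared), ("absolute", absolute), ("exponential", exponential)]:
    print(name, [round(p, 3) for p in result])
```

All three results sum to 1. Squaring and absolute value both discard the sign, so with these numbers -1.0 ends up with a larger share than 0.5, while the exponential version keeps the original ordering.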