Is it really mean squared error?

Hello. In the 4th notebook about MNIST basics, there is a line of code that defines a loss function before going through the end-to-end stochastic gradient descent example. That line is:

``````
def mse(preds, targets):
    return ((preds-targets)**2).mean().sqrt()
``````

The function is named `mse`, but the formula clearly computes the square root of the MSE! Wouldn’t it be more correct to either name the function `rmse` (for root mean squared error) or remove the `.sqrt()` call?
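To make the naming concern concrete, here is a plain-Python sketch (the `preds`/`targets` values are made up, not the notebook’s data) showing that the notebook’s function returns the square root of the MSE, not the MSE itself:

``````python
# Hypothetical numbers just for illustration.
preds   = [3.0, 5.0, 7.0]
targets = [2.0, 4.0, 9.0]

sq_errors = [(p - t) ** 2 for p, t in zip(preds, targets)]  # [1.0, 1.0, 4.0]
mse_val   = sum(sq_errors) / len(sq_errors)  # mean of squared errors: 2.0
rmse_val  = mse_val ** 0.5                   # what the notebook's `mse` returns

print(mse_val, rmse_val)  # 2.0 1.4142135623730951
``````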

I am a little bit confused…

I’d agree with you that the loss here is actually `rmse`, not `mse`. However, in practice it doesn’t matter very much, since both are minimized by the same parameters. (The MSE “punishes” large errors more heavily than the RMSE does.)
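A tiny sketch of that “punishes more” point, with made-up error values (an assumption, not data from the notebook): doubling every error quadruples the MSE but only doubles the RMSE.

``````python
# Hypothetical per-example errors, purely for illustration.
def mean_squared_error(errors):
    return sum(e ** 2 for e in errors) / len(errors)

errors_small = [1.0, 1.0, 1.0]  # every prediction off by 1
errors_big   = [2.0, 2.0, 2.0]  # every prediction off by 2

print(mean_squared_error(errors_small))       # 1.0
print(mean_squared_error(errors_big))         # 4.0  (MSE quadruples)
print(mean_squared_error(errors_big) ** 0.5)  # 2.0  (RMSE only doubles)
``````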

Also, in Step 3, the initial loss value is 25823.8086, which is clearly not something you would get after taking the square root of the mean of the squared errors.

``````loss = mse(preds, speed)
loss
Out: tensor(25823.8086, grad_fn=<MeanBackward0>)
``````

Instead, I assume this is the value you would get from the MSE, not the RMSE… This is confusing and not beginner-friendly…
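One quick sanity check of the scale argument: if 25823.8086 were the MSE, the corresponding RMSE would be only about 160.7, which is why the printed value looks like an MSE rather than an RMSE.

``````python
import math

loss_value = 25823.8086          # the value printed in the notebook
print(math.sqrt(loss_value))     # ~160.70 -- what the RMSE would have been
``````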

I don’t know exactly what notebook this is, but note that the `grad_fn` says `MeanBackward0`. That makes me believe the `mean()` operation was the last thing executed. Otherwise it would have said something like `SqrtBackward`. (I may be wrong about this, but it does look like the `sqrt` was never used.)
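That reasoning can be checked directly: the `grad_fn` attribute names the last operation that produced a tensor. A minimal sketch (made-up tensors, not the notebook’s data; note that recent PyTorch versions print `SqrtBackward0` rather than `SqrtBackward`):

``````python
import torch

preds   = torch.tensor([3.0, 5.0, 7.0], requires_grad=True)
targets = torch.tensor([2.0, 4.0, 9.0])

loss_mse  = ((preds - targets) ** 2).mean()  # last op is mean()
loss_rmse = loss_mse.sqrt()                  # last op is sqrt()

print(type(loss_mse.grad_fn).__name__)   # MeanBackward0
print(type(loss_rmse.grad_fn).__name__)  # SqrtBackward0 (exact name varies by version)
``````

So a loss printed with `grad_fn=<MeanBackward0>` cannot have had `.sqrt()` applied last.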