Is there a difference between a "Loss" and a "Cost" function?

Is there a difference between a “Cost” function and a “Loss” function?

I am building out my own excel spreadsheet, trying to make sure I really get the concepts, and I am struggling a bit with terminology.

For example, I have seen Mean Squared Error referred to as a “Cost Function”.

But as I have been studying the notebooks (lesson 2 and the SGD-intro), I see the Sum of Squared Errors being used as a “loss” function.

I am not even sure whether MSE and SSE are the same (I was thinking they were one and the same, but maybe that is a bad assumption)…

Any advice would be very welcome. Thanks.

Hey @york,

In my experience there is no difference.

Cost function is a synonym for loss function. For “learning” / “optimizing” the model, you want to minimize the function.

In mathematical optimization, statistics, decision theory, machine learning and computational neuroscience, a loss function or cost function is a function that maps an event […]



Thank you @benediktschifferer,

That clears things up for me.

A cost function and a loss function are indeed the same thing. “Error”, in the sense of SSE and MSE, is the difference between the predicted value and the actual value. SSE is calculated by squaring each error, and then summing them. MSE is the sum of squared errors divided by the number of data points. Both of these are valid cost/loss functions.
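
To make the arithmetic concrete, here is a minimal sketch of the SSE and MSE calculations described above. The data values and variable names are made up purely for illustration; they do not come from the course notebooks.

```python
# Hypothetical example data (not from the course notebooks).
actual = [3.0, 5.0, 7.0]
predicted = [2.5, 5.0, 8.0]

# "Error" is the difference between the predicted and the actual value.
errors = [p - a for p, a in zip(predicted, actual)]

# SSE: square each error, then sum the squares.
squared_errors = [e ** 2 for e in errors]
sse = sum(squared_errors)

# MSE: the sum of squared errors divided by the number of data points.
mse = sse / len(squared_errors)

print("SSE:", sse)
print("MSE:", mse)
```

Since MSE is just SSE divided by a constant (the number of data points), both are smallest at the same weights; they differ only in scale.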


Thanks for this explanation @munyari, it was very clear.

As I work through the course and try to understand the topics, one thing that confuses me a little is why we need the derivative of the cost function.

For example, if I increase a weight by a “little bit”, and the cost function output goes down, then can’t I confidently increase the weight by a “little bit” based on that cost function alone? I am trying to understand what the derivative tells me with respect to the output of the cost function.
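
The “little bit” experiment described above can be sketched numerically. This is only a rough illustration, assuming a hypothetical one-weight model `y = w * x`, made-up data, and MSE as the cost; none of these names come from the lessons. It compares the change in cost from a small nudge with the derivative worked out by hand for this particular model.

```python
# Hypothetical data and one-weight model: prediction = w * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def cost(w):
    # Mean squared error of the predictions w * x against the targets.
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cost_derivative(w):
    # Derivative of the MSE above with respect to w, worked out by hand.
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 1.0
nudge = 1e-6

# Change in cost per unit of nudge: an approximation of the slope at w.
slope_from_nudge = (cost(w + nudge) - cost(w)) / nudge
slope_from_derivative = cost_derivative(w)

print("slope from nudge:     ", slope_from_nudge)
print("slope from derivative:", slope_from_derivative)
```

The two numbers should nearly agree. In this sketch the nudge experiment is an approximation of the derivative: the sign says which direction lowers the cost, and the size says how steeply the cost changes. The derivative gives both in one calculation instead of one trial nudge per weight.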

I know it has been explained in a lesson somewhere, and I will be reviewing it again.

Thanks again.

Loss is for a single training sample. Cost is for the entire training set, i.e. the average of the loss across all training samples.
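
Under that convention, the distinction could be sketched as below. The function names `squared_loss` and `cost` and the sample values are hypothetical, chosen only to illustrate the per-sample versus whole-set split; squared error is assumed as the loss.

```python
def squared_loss(predicted, actual):
    # Loss: computed for a single training sample.
    return (predicted - actual) ** 2

def cost(predictions, targets):
    # Cost: the average of the per-sample losses over the whole set.
    losses = [squared_loss(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)

# Hypothetical predictions and targets.
predictions = [2.0, 5.0]
targets = [4.0, 5.0]

print("loss for first sample:", squared_loss(predictions[0], targets[0]))
print("cost for the set:     ", cost(predictions, targets))
```

With squared error as the per-sample loss, the cost computed this way is the MSE.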