A cost function and a loss function are indeed the same thing. “Error”, in the sense of SSE and MSE, is the difference between the predicted value and the actual value. SSE (sum of squared errors) is calculated by squaring each error and then summing the squares. MSE (mean squared error) is the SSE divided by the number of data points. Both of these are valid cost/loss functions.
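In case it helps, here is a minimal sketch of those two definitions in plain Python. The function names `sse` and `mse` and the sample numbers are made up for illustration; they are not from the course material.

```python
def sse(predictions, targets):
    """Sum of squared errors: square each error, then add the squares up."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets))


def mse(predictions, targets):
    """Mean squared error: the SSE divided by the number of data points."""
    return sse(predictions, targets) / len(targets)


# Made-up example data (an assumption, only for illustration).
predictions = [2.5, 0.0, 2.0]
targets = [3.0, -0.5, 2.0]

print(sse(predictions, targets))  # errors are -0.5, 0.5, 0.0, so SSE = 0.5
print(mse(predictions, targets))  # SSE / 3
```

The only difference between the two is the division by the number of points, so both are smallest at the same weights.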
Thanks for this explanation @munyari, it was very clear.
As I work through the course and try to understand the topics, one thing that confuses me a little is why we need the derivative of the cost function.
For example, if I increase a weight by a “little bit”, and the cost function output goes down, then can’t I confidently increase the weight by a “little bit” based on that cost function alone? I am trying to understand what the derivative tells me with respect to the output of the cost function.
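To make the question concrete, here is a rough sketch of the two things being compared. It assumes a toy model with a single weight that predicts `w * x`, an MSE cost, and made-up data; the names `cost` and `cost_derivative` are hypothetical and not from the course.

```python
def cost(w, xs, ys):
    """MSE of a one-weight model that predicts w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cost_derivative(w, xs, ys):
    """Analytic derivative of the MSE above with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


# Made-up data (an assumption): the targets are exactly 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w = 0.5
nudge = 1e-4

# "Nudge and check": change in cost divided by the size of the nudge.
nudge_slope = (cost(w + nudge, xs, ys) - cost(w, xs, ys)) / nudge

# The derivative gives the slope at w directly, without trying a nudge.
exact_slope = cost_derivative(w, xs, ys)

print(nudge_slope, exact_slope)
```

Both numbers come out negative and nearly equal here, so the "nudge and check" result looks like an approximation of the derivative: the sign says which direction lowers the cost, and the magnitude says how steeply.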
I know it has been explained in a lesson somewhere, and I will be reviewing it again.