Loss functions - what to use and when?


Should I use different loss functions for things like object counting with regression compared to, say, ImageNet object categorization, or approximating some arbitrary mathematical function with regression?

Or is MSE just good enough for everything?

If not, what are the benefits, and which loss functions should I use in which situations?

Thanks in advance for all you helpful people! :slight_smile:

… continuing.

So Cross-Entropy is to be used in classification problems. And MSE (and alike) in regression problems.
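To make the split concrete, here's a minimal numpy sketch (not fastai's actual implementations) of the two losses: MSE averages squared differences between predictions and targets, while cross-entropy takes the negative log of the probability the model assigned to the true class.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error: average squared difference, for regression."""
    return np.mean((np.asarray(pred) - np.asarray(target)) ** 2)

def cross_entropy(probs, target_idx):
    """Cross-entropy for a single example: minus the log of the
    probability assigned to the true class, for classification."""
    return -np.log(probs[target_idx])

# Regression: predict a count of 3.0 when the truth is 5.0
print(mse([3.0], [5.0]))  # 4.0

# Classification: model assigns probability 0.7 to the correct class
print(cross_entropy(np.array([0.1, 0.7, 0.2]), 1))  # ≈ 0.357
```

In practice you'd use the library versions (e.g. PyTorch's `nn.MSELoss` and `nn.CrossEntropyLoss`, the latter expecting raw logits rather than probabilities), but the idea is the same.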

But what I’m still unsure about is the use of log loss vs MSE for regression problems. And L1 loss vs L2 loss? When should one use each for a regression problem, or is MSE just fine?

Log loss is just for classification problems; depending on the source you’re looking at, it’s the same thing as cross-entropy. One terminology note: “L1 loss” and “L2 loss” usually refer to mean absolute error and mean squared error, respectively. The related but distinct idea of “L1/L2 regularization” means a penalty is applied for the size of the parameters in your model. That happens automatically with the fastai library if you specify weight decay.
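A quick sketch of the practical difference between the two regression losses: because L2 squares the error, a single outlier can dominate the total, whereas L1 treats errors linearly and is therefore more robust to outliers.

```python
import numpy as np

def l1_loss(pred, target):
    """L1 loss (mean absolute error): errors contribute linearly."""
    return np.mean(np.abs(np.asarray(pred) - np.asarray(target)))

def l2_loss(pred, target):
    """L2 loss (mean squared error): errors contribute quadratically."""
    return np.mean((np.asarray(pred) - np.asarray(target)) ** 2)

pred   = np.array([1.0, 2.0, 3.0, 100.0])  # last prediction is a wild outlier
target = np.array([1.0, 2.0, 3.0, 4.0])

print(l1_loss(pred, target))  # 24.0   -> outlier contributes linearly
print(l2_loss(pred, target))  # 2304.0 -> outlier dominates quadratically
```

So if your training data contains outliers you don't want the model chasing, L1 (or a hybrid like Huber/smooth L1) can be a better choice than plain MSE.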

MSE is not good for all regression problems. It is most appropriate when the response variable is roughly normally distributed. If your response variable is skewed, you should think about how much you want to penalize under-predicting vs over-predicting: MSE penalizes both equally, but other loss functions do not. It’s a tricky question because ultimately it depends on the purpose of your study.
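One simple way to get that asymmetry is to weight the squared error differently on each side of the target. This is an illustrative sketch, not a standard library loss; the weights here are arbitrary.

```python
import numpy as np

def asymmetric_squared_loss(pred, target, over_weight=2.0, under_weight=1.0):
    """Squared error that weights over-predictions (pred > target)
    differently from under-predictions. Weights are illustrative,
    not a recommendation for any particular problem."""
    err = np.asarray(pred) - np.asarray(target)
    weights = np.where(err > 0, over_weight, under_weight)
    return np.mean(weights * err ** 2)

# Same absolute error of 2, but over-predicting costs twice as much:
print(asymmetric_squared_loss([7.0], [5.0]))  # 8.0 (over-prediction)
print(asymmetric_squared_loss([3.0], [5.0]))  # 4.0 (under-prediction)
```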


Thanks for this helpful response, Patrick. If one wanted to penalize the model more when it over-predicts, what would you recommend as a loss function?


Hi cinbez.

You would want to look at quantile regression loss functions for that type of problem.
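For reference, the standard quantile regression loss is the "pinball" loss. A minimal sketch: for quantile `q`, under-predictions are weighted by `q` and over-predictions by `1 - q`, so choosing a low `q` (here 0.1, an arbitrary illustrative value) penalizes over-prediction more heavily.

```python
import numpy as np

def quantile_loss(pred, target, q=0.5):
    """Pinball loss for quantile q. Under-predictions (target > pred)
    are weighted by q, over-predictions by (1 - q). With q = 0.5 this
    is half of the L1 loss."""
    err = np.asarray(target) - np.asarray(pred)
    return np.mean(np.maximum(q * err, (q - 1) * err))

# With q = 0.1, over-predicting by 2 costs much more than under-predicting by 2:
print(quantile_loss([7.0], [5.0], q=0.1))  # ≈ 1.8 (over-prediction, heavily penalized)
print(quantile_loss([3.0], [5.0], q=0.1))  # ≈ 0.2 (under-prediction, lightly penalized)
```

Minimizing this loss makes the model predict the q-th quantile of the response rather than its mean, which is exactly the knob you want for asymmetric penalties.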