Should I use different loss functions for things like object counting using linear regression, compared to, let’s say, ImageNet object categorization, or approximating some arbitrary mathematical function using linear regression?

Or is MSE just good enough for everything?

If not, what are the benefits, and which loss functions should I use in which situations?

So cross-entropy is to be used in classification problems, and MSE (and the like) in regression problems.
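For anyone skimming, here is a rough NumPy sketch of that split. The function names and the toy numbers are made up for illustration and are not taken from any particular library: cross-entropy scores predicted class probabilities against integer class labels, while MSE scores real-valued predictions (say, object counts) against real-valued targets.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    # Pick, for each sample, the probability predicted for its true class.
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked)))

def mse(pred, target):
    """Mean squared error between real-valued predictions and targets."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

# Classification: predicted probabilities for 2 samples over 3 classes.
probs = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
labels = [0, 2]
ce = cross_entropy(probs, labels)

# Regression: predicted vs. true object counts (illustrative values).
mse_val = mse([2.0, 5.0, 3.0], [2.0, 4.0, 5.0])

print(f"cross-entropy: {ce:.4f}, MSE: {mse_val:.4f}")
```

In a real framework you would normally use the built-in losses, which typically take raw logits rather than probabilities for numerical stability.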

But what I’m still unsure about is the use of log loss vs MSE for regression problems. And what about L1 loss and L2 loss? When should one use each for a regression problem, or is MSE just fine for all of them?

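Log loss is just for classification problems. It’s the same thing as cross-entropy, depending on what source you’re looking at. L1 and L2 loss are the absolute-error and squared-error losses on the residuals (L2 loss, once averaged, is MSE). Don’t confuse them with L1 and L2 regularization, which means that there is a penalty applied for the size of the parameters in your model. An L2-style penalty is applied automatically by the fastai library if you specify weight decay.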
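Since "L1" and "L2" get used in two senses, here is a minimal NumPy sketch that keeps them apart. The helper names (`l1_loss`, `l2_loss`, `l2_penalty`), the `wd` value, and the numbers are all made up for illustration, and this is not fastai code. As losses, L1 and L2 are usually taken to mean mean absolute error and mean squared error on the residuals; as penalties, the same norms are applied to the model's weights instead.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error on the residuals."""
    return float(np.mean(np.abs(np.asarray(pred, float) - np.asarray(target, float))))

def l2_loss(pred, target):
    """Mean squared error on the residuals."""
    return float(np.mean((np.asarray(pred, float) - np.asarray(target, float)) ** 2))

def l2_penalty(weights, wd):
    """Penalty on parameter size; wd is a hypothetical weight-decay coefficient."""
    return float(wd * np.sum(np.asarray(weights, float) ** 2))

target = [1.0, 2.0, 3.0, 4.0]
pred = [1.0, 2.0, 3.0, 14.0]  # three exact predictions, one error of 10

l1 = l1_loss(pred, target)   # the single large error contributes linearly
l2 = l2_loss(pred, target)   # the same error contributes quadratically
pen = l2_penalty([0.5, -2.0], wd=0.01)  # depends only on the weights, not the data

print(l1, l2, pen)
```

The single large residual dominates the L2 loss far more than the L1 loss, which is one common reason people reach for L1 (or Huber-style losses) when the targets contain outliers.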

MSE is not good for all regression problems. MSE is most appropriate when the response variable is (roughly) normally distributed around the prediction, i.e. when the errors are normal. If your response variable is skewed, you should think about how much you want to penalize under-predicting vs over-predicting. MSE penalizes both the same, but other loss functions do not. It’s a tricky question because ultimately it depends on the purpose of your study.
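As one example of a loss that treats the two directions differently, here is a small sketch of the quantile ("pinball") loss next to MSE. The choice of pinball loss, the quantile `q = 0.9`, and the numbers are my own illustration, not something prescribed above; other asymmetric losses would make the same point.

```python
import numpy as np

def mse(pred, target):
    """Symmetric: over- and under-prediction of the same size cost the same."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    return float(np.mean((pred - target) ** 2))

def pinball_loss(pred, target, q):
    """Quantile loss: under-prediction costs q per unit, over-prediction 1 - q."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    diff = target - pred  # positive when we under-predict
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))

target = [10.0]
under = [9.0]   # under-predict by 1
over = [11.0]   # over-predict by 1

mse_under, mse_over = mse(under, target), mse(over, target)
pin_under = pinball_loss(under, target, q=0.9)
pin_over = pinball_loss(over, target, q=0.9)

print(mse_under, mse_over)  # identical penalties
print(pin_under, pin_over)  # under-prediction is penalized more heavily
```

With `q` above 0.5 the loss pushes predictions upward, and with `q` below 0.5 it pushes them downward, so the value of `q` encodes which kind of mistake is more costly for your application.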