I was wondering if anyone could help me answer this question from the end-of-chapter questionnaire: What would happen if we used cross-entropy loss with MovieLens? How would we need to change the model?

I’ve tried to think about it for a while but can’t get my head around it.

I’m guessing it comes down to what the loss function is used for here, which is predicting the ratings. We are solving a regression problem, that is, predicting a continuous number. Cross-entropy loss is usually used to measure the difference between two probability distributions, which makes it a fit for classification rather than regression. We use MSE loss here to measure how far each predicted rating is from the actual rating. You could also use absolute error (L1 loss).
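To make the difference concrete, here is a rough sketch in plain Python (no fastai or PyTorch, so the numbers are easy to check by hand). It assumes ratings are the integers 1 to 5 and that, for the cross-entropy version, the model would output five scores (one per rating) instead of a single number. The helper names and the example logits are made up for illustration and are not from the book.

```python
import math

def mse_loss(pred, target):
    # Squared error for a single continuous prediction.
    return (pred - target) ** 2

def l1_loss(pred, target):
    # Absolute error for a single continuous prediction.
    return abs(pred - target)

def softmax(logits):
    # Turn raw scores into probabilities (shifted by the max for stability).
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_loss(logits, target_class):
    # Negative log of the probability assigned to the correct class.
    return -math.log(softmax(logits)[target_class])

# Regression view: the model outputs one number.
true_rating = 4.0
pred_rating = 3.5
reg_mse = mse_loss(pred_rating, true_rating)  # 0.25
reg_l1 = l1_loss(pred_rating, true_rating)    # 0.5

# Classification view (assumption): ratings 1..5 become class indices 0..4,
# and the model outputs five logits, one per possible rating.
target_class = int(true_rating) - 1
logits_near = [0.0, 0.0, 3.0, 0.0, 0.0]  # most confident in rating 3
logits_far = [3.0, 0.0, 0.0, 0.0, 0.0]   # most confident in rating 1
ce_near = cross_entropy_loss(logits_near, target_class)
ce_far = cross_entropy_loss(logits_far, target_class)

# One way to get a single number back: the probability-weighted rating.
probs = softmax(logits_near)
expected_rating = sum(p * (i + 1) for i, p in enumerate(probs))
```

Note that `ce_near` and `ce_far` come out the same: cross-entropy only looks at the probability given to the correct class, so guessing 3 when the answer is 4 is punished just as much as guessing 1. MSE and L1 do take that distance into account, which is one reason they are a natural choice for ratings.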

I don’t intend to promote my work, but you can check out my article on cross-entropy loss to get some intuition for it.