MovieLens with CrossEntropyLoss

In the further research section of lesson 8 (collaborative filtering), we were asked this question:

Create a model for MovieLens which works with CrossEntropy loss, and compare it to the model in this chapter.

As expected, this does not work just by changing the loss function. Is the idea here to predict an integer between 0 and 5? What about the .5 ratings? Does this make any sense? Can I get any tips on how to do that? I imagine I have to change the DataLoaders so that the y parameter is a tensor of 5 probabilities, and also change the forward function inside the DotProduct module.


I believe the high-level idea here is to treat this problem as a classification problem, so ignoring treating the rating as non-ordinal and non-continuous. Then this problem becomes multi-class classification, and the results of your model would go into a softmax layer with X outputs where X is the number of classes. You can then calculate accuracy and see how well the model performs in this classification problem.

You should try to implement this yourself, as I believe it’s a nice opportunity to fiddle with things manually and see what happens. Let me know if you need any help :slight_smile:

Hi orendar, thanks for your response!
So what classes would you use here? I didn’t quite understand what you meant here:

Also, the only models I’ve seen created are the MNIST one (lesson 4) and the one in lesson 8, so I have some doubts… To change the layer output I have to edit the forward return value, right? So in this case it should be a tensor of size [#classes, 1]? Another question I have is whether I need to apply the softmax myself or not, since the CELoss already applies it.

Am I right?


Sorry, I meant “ignoring the order and treating the rating as non-ordinal and non-continuous.” So instead of predicting a continuous number between 1 and 5, you are predicting a class (for example here we could have 9 classes: 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5). You would need to treat the ratings as class labels, and therefore the task would now be multiclass classification.
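
As a minimal sketch of that mapping, each rating can be converted to an integer class index and back. It assumes ratings run from 1 to 5 in half-star steps, as in the example above, and the helper names are made up for illustration:

```python
# Hypothetical helpers: map half-star ratings (1.0, 1.5, ..., 5.0)
# to class indices (0, 1, ..., 8) and back again.
N_CLASSES = 9

def rating_to_class(rating):
    """Convert a rating such as 3.5 to its class index (here 5)."""
    idx = round(rating * 2) - 2
    if not 0 <= idx < N_CLASSES:
        raise ValueError(f"rating {rating} is outside the 1 to 5 range")
    return idx

def class_to_rating(idx):
    """Convert a class index back to the rating it stands for."""
    return (idx + 2) / 2

ratings = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
classes = [rating_to_class(r) for r in ratings]
print(classes)
```

If the dataset only contains whole-star ratings, the same idea gives 5 classes instead of 9.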

Regarding the loss - it depends on the loss function you use, as Jeremy explains in the lesson. If you use a loss function which already applies the softmax, then you just need 9 outputs (one per class) from a fully-connected layer; otherwise you also need to add a softmax layer yourself.
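
To illustrate the point about softmax, here is a small sketch in plain PyTorch (not fastai) with random numbers standing in for model outputs. It shows that `F.cross_entropy` takes raw outputs and applies log-softmax internally, and that the targets are class indices of shape [bs], not a [bs, 9] tensor of probabilities:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 9)            # raw model outputs: batch of 4, 9 classes
targets = torch.tensor([0, 3, 8, 5])  # class indices, shape [4], dtype int64

# Built-in cross entropy: expects raw logits, applies log-softmax itself.
loss_builtin = F.cross_entropy(logits, targets)

# The same computation done by hand: log-softmax, then negative log-likelihood.
log_probs = F.log_softmax(logits, dim=1)
loss_manual = F.nll_loss(log_probs, targets)

print(loss_builtin.item(), loss_manual.item())
```

Because the two values agree, adding an extra softmax layer before this loss would apply softmax twice.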

Hello again,
so I’ve been working on it and I think I’ve got something, but I can’t find a way to give the y from my dataloaders a shape of [64, 9]. The shapes of a batch from my dls are:

dls = CollabDataLoaders.from_df(ratings, item_name='title', bs=64)
x,y = dls.one_batch()
x.shape, y.shape
(torch.Size([64, 2]), torch.Size([64, 1]))

What I did manage to do was to make my forward function return a tensor of size [64,9], meaning batch size and number of classes.

Running the trainer I get the following error

model = DotProductBias(n_users, n_movies, 50)
learn = Learner(dls, model, loss_func=CrossEntropyLossFlat)
learn.fit_one_cycle(5, 5e-3)

RuntimeError: Boolean value of Tensor with more than one value is ambiguous
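
One likely cause of this error (an assumption, since the notebook was not run here) is that `loss_func` receives the class `CrossEntropyLossFlat` instead of an instance, `CrossEntropyLossFlat()`. When the learner later calls `loss_func(pred, targ)`, the tensors end up as constructor arguments rather than as inputs to a loss computation. The sketch below reproduces the same pattern with plain PyTorch's `nn.CrossEntropyLoss`, using random tensors in place of real predictions:

```python
import torch
from torch import nn

logits = torch.randn(64, 9)              # stand-in for model output: [bs, n_classes]
targets = torch.randint(0, 9, (64,))     # stand-in for class indices: [bs]

# Misuse: calling the class with tensors runs the constructor, not the loss.
loss_cls = nn.CrossEntropyLoss
try:
    loss_cls(logits, targets)
    misuse_error = None
except Exception as e:                   # caught broadly; the message may vary by version
    misuse_error = e
print(type(misuse_error).__name__, misuse_error)

# Intended use: create an instance first, then call it on (prediction, target).
loss_func = nn.CrossEntropyLoss()
loss = loss_func(logits, targets)
print(loss.item())
```

In fastai the corresponding change would be `loss_func=CrossEntropyLossFlat()`. The targets would also need to be integer class indices rather than the float ratings of shape [64, 1] shown earlier.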

I will share my notebook in case it helps:

PS: I don’t really know if what I have done is right or at all useful, but it should work, I think.

@veci did you try using neural nets instead of the DotProductBias? I tried CrossEntropyLoss with NNs, but the results weren’t good: only 40-50% accuracy.
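
For anyone comparing the two approaches, here is a rough sketch of what a neural-net classifier for this task could look like in plain PyTorch. The class name, layer sizes, and random data are all hypothetical and not taken from any notebook in this thread. It assumes 9 rating classes and inputs of shape [bs, 2] holding user and movie indices:

```python
import torch
from torch import nn

class CollabClassifier(nn.Module):
    """Hypothetical model: embeddings -> small MLP -> one logit per rating class."""
    def __init__(self, n_users, n_movies, n_factors=50, n_hidden=100, n_classes=9):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, n_factors)
        self.movie_emb = nn.Embedding(n_movies, n_factors)
        self.layers = nn.Sequential(
            nn.Linear(2 * n_factors, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, n_classes),
        )

    def forward(self, x):
        # x has shape [bs, 2]: column 0 is the user index, column 1 the movie index.
        emb = torch.cat([self.user_emb(x[:, 0]), self.movie_emb(x[:, 1])], dim=1)
        return self.layers(emb)  # raw logits, shape [bs, n_classes]; no softmax here

# Random stand-in data, only to check shapes and the loss/accuracy computation.
model = CollabClassifier(n_users=100, n_movies=200)
x = torch.stack([torch.randint(0, 100, (64,)), torch.randint(0, 200, (64,))], dim=1)
y = torch.randint(0, 9, (64,))

logits = model(x)
loss = nn.CrossEntropyLoss()(logits, y)
accuracy = (logits.argmax(dim=1) == y).float().mean()
print(logits.shape, loss.item(), accuracy.item())
```

Accuracy here counts only exact matches, so a prediction of 4 for a true rating of 4.5 scores the same as a prediction of 1. That may be part of why accuracy figures look low next to the regression model's error.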