Collaborative filtering predict/inference (part 1 2020)

duchaba · July 28, 2021, 7:22am

After you have trained the collaborative filtering model in Google Colaboratory (as in chapter 7 of the fastbook), how do you predict ratings for movies that a user has not seen? For example, user-id 242 has watched/rated 20 movies, so how do I predict the rating for user-242 on the other movies?

In other words, I want to see the predicted ratings for all 1,682 movies for user-242. cc: @jeremy @sgugger

Does anyone have a solution or even a suggestion on how to do the above?

dhruv.metha · August 2, 2021, 8:39pm

Hey @duchaba

The forward pass of the model (the dot product between the user and movie latent factors, plus the biases) gives us the predicted rating. I quickly put together a get_ratings function that you can use directly or treat as a starting point for inference.

You can pass in any number of movies for a single user as shown below or vice versa.
It would also work for equal lengths of movies and users.

def get_ratings(learn, users=tensor([]), items=tensor([])):
  # Works for one user with many items, one item with many users, or equal-length lists
  model = learn.model
  with torch.no_grad():
    if not isinstance(items, torch.Tensor): items = tensor(items).view(-1)
    if not isinstance(users, torch.Tensor): users = tensor(users).view(-1)
    # An empty argument means "all items" / "all users"
    if len(items) == 0: 
      items = torch.arange(model.i_weight.num_embeddings)
    if len(users) == 0: 
      users = torch.arange(model.u_weight.num_embeddings)
    
    try:
      # Element-wise product of the latent factors, summed over the factor dimension
      dot = model.u_weight(users) * model.i_weight(items)
      res = dot.sum(1) + model.u_bias(users).squeeze() + model.i_bias(items).squeeze()
      # Squash the raw score into the model's rating range
      return torch.sigmoid(res) * (model.y_range[1]-model.y_range[0]) + model.y_range[0]
    except Exception as e:
      print(f'The user/item index may be invalid: {e}')

get_ratings(learn, users=242, items=[10, 20, 3])
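To make the arithmetic of that forward pass concrete, here is a minimal sketch for a single user/movie pair in plain Python. The function name predict_rating, the factor values, and the y_range of (0, 5.5) are assumptions for illustration; in practice the factors and biases come from the trained embeddings.

```python
import math

def predict_rating(user_factors, item_factors, user_bias, item_bias, y_range=(0.0, 5.5)):
    """Dot product of the latent factors plus both biases, squashed into y_range."""
    raw = sum(u * i for u, i in zip(user_factors, item_factors)) + user_bias + item_bias
    low, high = y_range
    # Same as sigmoid(raw) * (high - low) + low
    return low + (high - low) / (1.0 + math.exp(-raw))

# Made-up factors for illustration only; real ones come from the trained model.
rating = predict_rating([0.5, -1.0], [1.0, 0.5], user_bias=0.1, item_bias=-0.1)
print(rating)  # raw score is 0, sigmoid(0) = 0.5, so this prints 2.75
```

A raw score of zero lands in the middle of the range, which is why the sigmoid scaling keeps every prediction between the two ends of y_range.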

There may be some broadcasting tricks that I’m missing here as this does not work for multi-user multi-items when they have different lengths. For example,

get_ratings(learn, users=[242, 10], items=[10, 20, 3]) #this would not work

Hope this helps!

dhruv.metha · August 2, 2021, 9:21pm

Hey @duchaba

An edit of my earlier reply.

There may be some broadcasting tricks that I’m missing here as this does not work for multi-user multi-items when they have different lengths.

The following function returns ratings indexed by user, for any number of users and items passed to it.

def get_ratings(learn, users=tensor([]), items=tensor([])):
  # Returns a [n_users, n_items] tensor of predicted ratings
  model = learn.model
  with torch.no_grad():
    if not isinstance(items, torch.Tensor): items = tensor(items).view(-1)
    if not isinstance(users, torch.Tensor): users = tensor(users).view(-1)
    # An empty argument means "all items" / "all users"
    if len(items) == 0: 
      items = torch.arange(model.i_weight.num_embeddings)
    if len(users) == 0: 
      users = torch.arange(model.u_weight.num_embeddings)
    try:
      # unsqueeze(1) lets every user broadcast against every item
      dot = model.u_weight(users).unsqueeze(1) * model.i_weight(items)
      res = dot.sum(-1) + model.u_bias(users) + model.i_bias(items).squeeze()
      # Squash the raw scores into the model's rating range
      return torch.sigmoid(res) * (model.y_range[1]-model.y_range[0]) + model.y_range[0]
    except Exception as e:
      print(f'The user/item index may be invalid: {e}')
      
get_ratings(learn, users=[242, 500], items=[111, 102, 105])

Now this would work:
get_ratings(learn, users=[242, 10], items=[10, 20, 3])
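For anyone wondering why the unsqueeze(1) matters, here is a small standalone sketch of the shapes involved. It uses random tensors as stand-ins for the embedding lookups (user_vecs and item_vecs are made-up names), so it runs without a trained model.

```python
import torch

n_factors = 4
user_vecs = torch.randn(2, n_factors)   # stand-in for u_weight(users), 2 users
item_vecs = torch.randn(3, n_factors)   # stand-in for i_weight(items), 3 items

# Element-wise product of [2, 4] and [3, 4] cannot broadcast because 2 != 3.
try:
    user_vecs * item_vecs
    mismatch_failed = False
except RuntimeError:
    mismatch_failed = True

# unsqueeze(1) turns [2, 4] into [2, 1, 4], which broadcasts against [3, 4] to [2, 3, 4].
scores = (user_vecs.unsqueeze(1) * item_vecs).sum(-1)   # shape [2, 3]

# The result matches a plain matrix product of users against items.
same_as_matmul = torch.allclose(scores, user_vecs @ item_vecs.T, atol=1e-5)
print(mismatch_failed, scores.shape, same_as_matmul)
```

Each row of scores belongs to one user and each column to one item, which is the layout the edited get_ratings returns before the biases and sigmoid are applied.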

duchaba · August 3, 2021, 4:05am

@dhruv.metha :

Thanks for the insight. I think I follow it, but I am not quite sure about the parameters. In users=[242, 10], “242” is the user-ID, but what does the “10” represent? Is “10” another user-id?

What do the three values in items=[10,20,3] represent? Is items a list of movie-ids?

I also ran into a device mismatch error between CPU and CUDA/GPU (on Google Colab), so I added these lines:
items = items.to(device='cuda')
users = users.to(device='cuda')
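A variant of the two lines above that does not hard-code 'cuda' is to read the device from the model's own parameters. This is only a sketch: indices_on_model_device is a hypothetical helper name, and a small nn.Embedding stands in for learn.model so the snippet runs on its own.

```python
import torch
from torch import nn

def indices_on_model_device(model, ids):
    """Return ids as a 1-D long tensor on the same device as the model's parameters."""
    device = next(model.parameters()).device
    return torch.as_tensor(ids, dtype=torch.long).view(-1).to(device)

# Stand-in embedding; with fastai the model would be learn.model.
emb = nn.Embedding(10, 3)
users = indices_on_model_device(emb, [2, 4])
print(emb(users).shape)  # one 3-dimensional vector per index
```

The same call works whether the model sits on the CPU or the GPU, so the code does not need to change between a local run and Colab.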

Regards,
Duc Haba

dhruv.metha · August 3, 2021, 5:00am

Yeah @duchaba

Yes, users=[242, 10] is the list of user-ids, so 10 is another user-id.

And items=[10, 20, 3] is the list of item ids (movie ids).

Yes, I missed the device issue! It’s GPU ready now!

What I mean by the output being indexed by users is that output[0] holds the ratings that the first user in the list would give to each of the given items (movies).

For example, in our case

output[0] would be user-id 242's ratings for the 3 items (movies) [10, 20, 3]
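Since each row of the output belongs to one user, one way to get back to the original question (movies the user has not rated yet) is to mask the already-rated items in that row and take the highest remaining predictions. The sketch below assumes a made-up row of predicted ratings and a made-up set of seen item ids; top_unseen is a hypothetical helper, not part of fastai.

```python
import torch

def top_unseen(user_ratings, seen_items, k=3):
    """Return (item_ids, ratings) for the k best-predicted items not in seen_items."""
    scores = user_ratings.clone()
    scores[list(seen_items)] = float('-inf')   # rule out items the user already rated
    best = scores.topk(k)
    return best.indices.tolist(), best.values.tolist()

# Hypothetical predicted ratings for one user over six items (ids 0 to 5).
row = torch.tensor([3.1, 4.7, 2.0, 4.9, 4.2, 1.5])
ids, ratings = top_unseen(row, seen_items={1, 3}, k=2)
print(ids)  # items 1 and 3 are excluded, so this prints [4, 0]
```

With a real model, row would be something like output[0] from a call that leaves items empty, so that it covers every movie.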

Why do we say items instead of movies?
In general, the recommendation can be for anything: movies, songs, games, etc. Hence we use items as the more general term.

duchaba · August 3, 2021, 6:42am

Big Thanks!