At the end of Lesson 5 we learned that collaborative filtering can approximate the results of a user - movie rating database with astonishingly low error. But this seems to be a dead end – all we’ve really done is reproduce the results of a giant movie rating survey – we haven’t gained any new knowledge.
For our result to be practical, we would want to be able to predict how a new user who is not in our database would rate a new movie that is not in our database.
How could we do this? If the user and movie embeddings were based on a set of features, we could compare their embeddings to their nearest neighbors in the training set and use some sort of averaging to predict how any new user would rank any new movie. But there aren’t enough user and movie features.
So I am a bit perplexed here as to how these results can be used practically. Can anyone weigh in with some additional insight?
Just for fun, I modified the EmbeddingNet class to incorporate an embedding for movie genres. As Jeremy predicted, this didn’t improve the score.
I think the really amazing takeaway from this part of Lesson 5 is the simplicity of the matrix factorization technique that allows us to decompose a huge MxN rectangular matrix into the product of an MxJ matrix and a JxN matrix, for arbitrary integer J! This means solving for M*N*J^2 weights given only M*N datapoints!