Lesson 5 - recommender system vs. machine learning model

(Giorgio) #1

Is there a theoretical reason why we should have a better chance of correctly imputing missing values for some (user_id, movie_id) combinations with matrix factorization and similar techniques, versus just building a regression model with ratings as the target and (user_id, movie_id, other_features) as right-hand-side variables?

The regression-based approach would unlock the opportunity to use regressors that are both explicit (user’s age, income, etc.; movie genre, budget, etc.) and implicit (user and movie embeddings). I guess that would be awkward to do in the context of matrix factorization.
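A toy sketch of what such a design matrix could look like, with each row pairing explicit user/movie features with embedding vectors. All feature names and values here are illustrative stand-ins, not from any particular dataset, and the embeddings are random placeholders for vectors that would be learned elsewhere:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, k = 4, 3, 2          # tiny toy sizes; k = embedding dim

# Explicit features (illustrative): e.g. user age/income, movie budget
user_feats  = rng.normal(size=(n_users, 2))
movie_feats = rng.normal(size=(n_movies, 1))

# Embeddings would be learned (e.g. by a NN); random stand-ins here
user_emb  = rng.normal(size=(n_users, k))
movie_emb = rng.normal(size=(n_movies, k))

def make_row(u, m):
    """Regressor vector for one (user_id, movie_id) pair."""
    return np.concatenate([user_feats[u], movie_feats[m],
                           user_emb[u], movie_emb[m]])

X = np.array([make_row(u, m) for u in range(n_users) for m in range(n_movies)])
print(X.shape)  # (12, 7): 2 + 1 explicit columns plus 2*k embedding columns
```

Any regressor could then be fit on `X` against the observed ratings.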

(James Leslie) #2

@gballardin We cannot use user_id and movie_id directly as inputs to a regression model, as a higher ID does not correspond to more "user-ness" or anything like that. Linear regression assumes a continuous relationship between the predictors and the target: as we increase the value of one predictor, it has a predictable impact on the target. That does not hold when our predictor variables are IDs with no ordinality. Using an embedding layer lets us represent each user as a vector of latent factors that captures their viewing tendencies.
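A minimal sketch of that idea: an embedding layer is essentially a lookup table with one learned k-vector per ID, so the raw ID is only ever used as an index, never as a numeric feature. The values below are random stand-ins for learned latent factors:

```python
import numpy as np

rng = np.random.default_rng(42)
n_users, k = 5, 3

# One k-dimensional row per user ID; in practice these are learned weights
embedding = rng.normal(size=(n_users, k))

user_id = 2                   # the raw ID carries no ordinal meaning...
vector = embedding[user_id]   # ...but its row is a usable dense feature
print(vector.shape)           # (3,)
```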

(Giorgio) #3

Correct, I was suggesting using user_id and movie_id embeddings as regressors, not the IDs themselves. We’re on the same page there.

I hinted at using regression in general, not specifically linear regression; regression more broadly can still model non-linear discontinuities. Think of using a tree-based regression, for example.

(James Leslie) #4

Ok cool, I’m with you. But how would we learn those embeddings if not using a NN?

(Giorgio) #5

You would in fact use a NN to calculate the embeddings in the first step. As a second step, you’d use your ML algo of choice to predict ratings, with the embeddings from the first step as features. That is the approach I am suggesting as an alternative to matrix factorization.
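A minimal sketch of that two-step pipeline on toy data. Step one uses plain SGD matrix factorization as a stand-in for the NN that learns the embeddings; step two fits a separate regressor on the concatenated embeddings (ridge regression here purely as a linear stand-in; a tree-based model would slot in the same way and, unlike ridge, could capture user–movie interactions). All sizes and hyperparameters are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, k = 20, 15, 4

# Toy ratings with noise; roughly half the entries are observed
true_u = rng.normal(size=(n_users, k))
true_m = rng.normal(size=(n_movies, k))
ratings = true_u @ true_m.T + 0.1 * rng.normal(size=(n_users, n_movies))
mask = rng.random((n_users, n_movies)) < 0.5

# Step 1: learn user/movie embeddings by SGD on the observed entries
U = 0.1 * rng.normal(size=(n_users, k))
M = 0.1 * rng.normal(size=(n_movies, k))
lr = 0.01
obs = [(u, m) for u in range(n_users) for m in range(n_movies) if mask[u, m]]
for _ in range(300):
    for u, m in obs:
        err = ratings[u, m] - U[u] @ M[m]
        U[u] += lr * err * M[m]
        M[m] += lr * err * U[u]

# Step 2: fit a separate regressor on the embeddings as features
# (ridge here; swap in a tree-based regressor the same way)
X = np.array([np.concatenate([U[u], M[m]]) for u, m in obs])
y = np.array([ratings[u, m] for u, m in obs])
Xb = np.hstack([X, np.ones((len(X), 1))])   # add intercept column
w = np.linalg.solve(Xb.T @ Xb + 1e-3 * np.eye(Xb.shape[1]), Xb.T @ y)

# Impute one unobserved (user, movie) pair with the step-2 model
u, m = next((u, m) for u in range(n_users) for m in range(n_movies)
            if not mask[u, m])
pred = np.concatenate([U[u], M[m], [1.0]]) @ w
print(float(pred))
```

The point of the sketch is only the division of labor: the factorization supplies the latent factors, and any downstream model consumes them (alongside explicit features, if available) to predict ratings.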