Hello all,
I would like to ask about n_factors when building a collaborative filtering model.
Can I pick an arbitrary number for n_factors? In the lesson, he used 5 or 50.
Also, could you explain what role n_factors plays?
n_factors is just the size of the embedding vector for each element (each user and each movie). In the tutorial images there are, say, x rows, and each row has 5 values. Those 5 values are the factors, so n_factors is 5 in that example. They are the learned features that describe a movie.
More factors give the model more capacity to fit the data, but they also make it harder to train. You should experiment with different values of n_factors and see what results you get.
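To make the shapes concrete, here is a minimal sketch in plain Python (not the fastai code from the lesson). The helper names `make_embeddings` and `predict` and the sizes are made up for illustration. It only shows that every user and every movie gets a vector of length n_factors, and that a predicted score is the dot product of the two vectors.

```python
import random

def make_embeddings(n_items, n_factors, seed=0):
    """Hypothetical helper: one random vector of length n_factors per item."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]

def predict(user_vec, movie_vec):
    """Predicted score is the dot product of the two embedding vectors."""
    return sum(u * m for u, m in zip(user_vec, movie_vec))

# Illustrative sizes, not taken from the lesson's dataset.
n_users, n_movies, n_factors = 4, 6, 5

user_emb = make_embeddings(n_users, n_factors, seed=1)    # 4 rows, 5 values each
movie_emb = make_embeddings(n_movies, n_factors, seed=2)  # 6 rows, 5 values each

# Untrained embeddings, so this score is meaningless until the model is fitted.
score = predict(user_emb[0], movie_emb[3])
print(len(movie_emb), "movies,", len(movie_emb[0]), "factors each")
```

Changing n_factors only changes the length of each row; the number of rows is fixed by how many users and movies you have.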
The ideal number of factors (the embedding size) depends on the complexity of the data. Using too many can lead to overfitting, while using too few may not be enough to capture the patterns in the underlying data. So using more factors is not always better.
The difficulty in choosing the right number therefore comes down to overfitting versus underfitting the model to your data. That balance is different for every dataset, so you have to find it by experiment.
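One way to run that experiment is to hold out some ratings and compare training and validation error for several values of n_factors. The sketch below is only an illustration under stated assumptions: it uses synthetic ratings and a bare-bones SGD matrix factorization in plain Python rather than fastai, and every name, size, and hyperparameter in it is made up. A validation error that gets worse while the training error keeps falling is the usual sign of overfitting.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def init(n_rows, n_factors, rng):
    """Small random starting embeddings."""
    return [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_rows)]

def train(data, n_users, n_movies, n_factors, rng, epochs=20, lr=0.01):
    """Plain SGD on squared error, no bias terms or regularization (assumption)."""
    P, Q = init(n_users, n_factors, rng), init(n_movies, n_factors, rng)
    data = list(data)  # copy so shuffling does not touch the caller's list
    for _ in range(epochs):
        rng.shuffle(data)
        for u, m, r in data:
            err = r - dot(P[u], Q[m])
            for k in range(n_factors):
                pu, qm = P[u][k], Q[m][k]
                P[u][k] += lr * err * qm
                Q[m][k] += lr * err * pu
    return P, Q

def rmse(data, P, Q):
    return math.sqrt(sum((r - dot(P[u], Q[m])) ** 2 for u, m, r in data) / len(data))

# Synthetic data: ratings generated from 3 hidden factors plus noise (assumption).
rng = random.Random(0)
n_users, n_movies, true_factors = 30, 40, 3
U = [[rng.gauss(0, 1) for _ in range(true_factors)] for _ in range(n_users)]
M = [[rng.gauss(0, 1) for _ in range(true_factors)] for _ in range(n_movies)]
ratings = []
for _ in range(600):
    u, m = rng.randrange(n_users), rng.randrange(n_movies)
    ratings.append((u, m, dot(U[u], M[m]) + rng.gauss(0, 0.5)))

# 80/20 split into training and validation ratings.
split = int(0.8 * len(ratings))
train_set, valid_set = ratings[:split], ratings[split:]

results = {}
for n_factors in [1, 3, 10, 30]:
    P, Q = train(train_set, n_users, n_movies, n_factors, random.Random(1))
    results[n_factors] = (rmse(train_set, P, Q), rmse(valid_set, P, Q))
    print(n_factors, "train RMSE %.3f, valid RMSE %.3f" % results[n_factors])
```

On real data you would do the same comparison with your actual model and pick the n_factors with the best validation score.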