I’ve been playing around with Collaborative Filtering and the MovieLens dataset and I was wondering…

**If I only wanted to get the best embeddings for each of the movies (later to be used for projections and distance calculations) – does generalization or overfitting matter?**

If I never plan on using new data, and I just want the most accurate vector representation for each movie in the dataset, then I feel like overfitting doesn’t matter: the lower the training error, the better.

… or am I misunderstanding / missing something?

**What I really want is the best possible vector representation of each movie, so I can calculate the similarity between movies based on the [Euclidean] distance between their embeddings** (is there a better distance metric to use in this context?). If there is a more efficient way to do this (DL or ML solution welcome), please let me know that as well (or instead )
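For concreteness, this is roughly the kind of lookup I have in mind once the embeddings are trained (the embedding matrix and movie titles below are made up, not from MovieLens), comparing Euclidean distance against cosine similarity:

```python
import numpy as np

# Hypothetical learned embeddings: one row per movie (values invented).
movies = ["Toy Story", "Heat", "GoldenEye"]
emb = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.3],
    [0.2, 0.7, 0.4],
])

def euclidean_neighbors(idx, emb):
    """Rank all movies by Euclidean distance to movie `idx` (closest first)."""
    dists = np.linalg.norm(emb - emb[idx], axis=1)
    return np.argsort(dists)

def cosine_neighbors(idx, emb):
    """Rank by cosine similarity (highest first); ignores vector magnitude."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = unit @ unit[idx]
    return np.argsort(-sims)

# Nearest neighbours of "Heat" under each metric.
print([movies[i] for i in euclidean_neighbors(1, emb)])
print([movies[i] for i in cosine_neighbors(1, emb)])
```

Cosine is a common alternative to Euclidean here because it compares only the direction of the vectors, not their length, which can help if embedding norms end up correlated with something like movie popularity.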

Thank you!