I was just exploring my collaborative filtering model and noticed that some embeddings are missing. I tried to get the index for userId '400', which is in ratings.csv, and got an exception. I guess it could be a problem with how CollabDataBunch splits the data, but I don't know how to fix it.
Here is a minimal example:
from fastai.collab import *
import torch
import pandas as pd

path = 'ml-latest-small/'
ratings = pd.read_csv(path+"ratings.csv")
movies = pd.read_csv(path+"movies.csv")
series2cat(ratings, 'userId', 'movieId')

data = CollabDataBunch.from_df(ratings, seed=42)
y_range = [0, 5.5]
learn = collab_learner(data, n_factors=50, y_range=y_range)
learn.fit_one_cycle(3, 1e-3)

print(ratings[ratings['userId'] == 400].head())
learn.get_idx(['400'])
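To test the splitting guess without fastai, a rough sketch like the one below could show whether an ID can end up only in the validation part of a random split. The helper `ids_missing_from_train` and the toy DataFrame are made up for illustration, and `DataFrame.sample` is only a stand-in for whatever split CollabDataBunch does internally, so this is an approximation and not the library's actual behavior.

```python
import pandas as pd

def ids_missing_from_train(df, col, valid_pct=0.2, seed=42):
    """Hypothetical helper: values of `col` that occur only in the
    validation part of a random row-wise split (not fastai's own split)."""
    valid = df.sample(frac=valid_pct, random_state=seed)  # random validation rows
    train = df.drop(valid.index)                          # everything else is training
    return sorted(set(valid[col]) - set(train[col]))

# Toy data (assumption): user 400 has a single rating, so it may land only in validation
toy = pd.DataFrame({
    "userId":  [1, 1, 1, 2, 2, 2, 3, 3, 3, 400],
    "movieId": [10, 20, 30, 10, 20, 30, 10, 20, 30, 10],
    "rating":  [4.0, 3.5, 5.0, 2.0, 4.5, 3.0, 1.0, 4.0, 3.5, 5.0],
})

print(ids_missing_from_train(toy, "userId"))
```

Running the same check on the real ratings DataFrame would list the user IDs that have no rows left on the training side under this stand-in split. If 400 is not in that list, the split is probably not the cause.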