Hi community,
I was exploring my collaborative-filtering model and noticed that some embeddings seem to be missing. I tried to get the index for userId ‘400’, which is present in ratings.csv, and got an exception. My guess is that it could be a problem with how CollabDataBunch splits the data, but I don't know how to fix it.
Here is a minimal example:
from fastai.collab import *
import torch
import pandas as pd
path = 'ml-latest-small/'
ratings = pd.read_csv(path+"ratings.csv")
movies = pd.read_csv(path+"movies.csv")
series2cat(ratings, 'userId', 'movieId')
data = CollabDataBunch.from_df(ratings, seed=42)
y_range = [0, 5.5]
learn = collab_learner(data, n_factors=50, y_range=y_range)
learn.fit_one_cycle(3, 1e-3)
print(ratings[ratings['userId'] == 400].head())
learn.get_idx(['400'])
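
To make my guess about the splitting concrete, here is a small sketch in plain Python, without fastai. It assumes a random row-wise train/validation split. The toy data and the helper name `users_missing_from_train` are made up for illustration and are not part of the fastai API. I have not confirmed that this is what actually happens inside CollabDataBunch.

```python
import random

def users_missing_from_train(rows, valid_pct=0.2, seed=42):
    """Return user ids present in `rows` but absent from the training
    part of a random row-wise split (hypothetical helper, not fastai)."""
    rng = random.Random(seed)
    shuffled = rows[:]                 # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    train = shuffled[n_valid:]         # the first n_valid rows act as validation
    train_users = {user for user, _ in train}
    all_users = {user for user, _ in rows}
    return sorted(all_users - train_users)

# Toy (userId, movieId) pairs: users 1-3 have 20 ratings each, user 400 has one.
toy_rows = [(u, m) for u in (1, 2, 3) for m in range(20)] + [(400, 7)]
missing = users_missing_from_train(toy_rows)
print(missing)
```

In this toy setup only a user with very few ratings can end up entirely in the validation part, so if userId 400 has many rows in ratings.csv, the split alone may not explain the exception.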