I’ve been trying all day to build a custom collaborative filtering model, taking inspiration from posts like this one [https://towardsdatascience.com/fast-ai-season-1-episode-5-3-collaborative-filtering-using-neural-network-48e49d7f9b36]. But it just isn’t working. In fact, not even the example from the link seems to work: lr_find() gives an error about the input and the target not matching when calculating the loss. What am I doing wrong?
Here’s my code so far:
n_factors=50 # Number of factors
n_mols = len(train_df['userid'].unique()) # Number of users
n_targets = len(train_df['itemid'].unique()) # Number of items
n_mols,n_targets
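As a side note on those counts: they only line up with the embedding tables if the raw ids are contiguous integers starting at 0. A minimal pure-Python sketch of remapping arbitrary ids to 0-based codes (the id values below are made up, standing in for a dataframe column):

```python
# Sketch: map arbitrary ids to contiguous 0-based codes so every value
# can safely index an embedding table of size len(uniq).
userids = ['CHEMBL25', 'CHEMBL25', 'CHEMBL112', 'CHEMBL521']  # hypothetical ids

uniq = sorted(set(userids))                        # unique ids, stable order
code_of = {uid: i for i, uid in enumerate(uniq)}   # id -> 0..n-1
codes = [code_of[uid] for uid in userids]          # remapped column
n_mols = len(uniq)                                 # embedding rows needed
```

With pandas the same thing would be a one-liner via something like `factorize`, but the idea is identical: after remapping, `max(codes) == n_mols - 1`, so no lookup can run past the table.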
class CollabChEMBL(nn.Module):
    def __init__(self, n_mols, n_target, n_factors, y_range):
        super().__init__()
        self.y_range = y_range
        self.m_weight = embedding(n_mols, n_factors)    # User embedding
        self.t_weight = embedding(n_target, n_factors)  # Item embedding
        self.lin1 = nn.Linear(100, 10)
        self.lin2 = nn.Linear(10, 1)

    def forward(self, mb, tb):
        cat = torch.cat([self.m_weight(mb), self.t_weight(tb)], dim=1)
        x = F.dropout(cat)   # Dropout_1
        x = self.lin1(x)     # Linear_1
        x = F.relu(x)        # Activation_1
        x = F.dropout(x)     # Dropout_2
        x = self.lin2(x)     # Linear_2
        # Scale the sigmoid output into y_range
        x = torch.sigmoid(x) * (self.y_range[1] - self.y_range[0]) + self.y_range[0]
        return x
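To sanity-check the layer sizes in that forward pass: with n_factors = 50, the two embedding lookups concatenate into a 100-wide tensor, which is exactly what nn.Linear(100, 10) expects. A small standalone sketch (table sizes and index values here are arbitrary, just for illustration):

```python
import torch

n_factors = 50
m_weight = torch.nn.Embedding(5, n_factors)  # 5 dummy "users"
t_weight = torch.nn.Embedding(7, n_factors)  # 7 dummy "items"

mb = torch.tensor([0, 1, 2])  # batch of 3 user indices
tb = torch.tensor([4, 5, 6])  # batch of 3 item indices

# Concatenating two (3, 50) lookups gives a (3, 100) input for lin1.
cat = torch.cat([m_weight(mb), t_weight(tb)], dim=1)
lin1 = torch.nn.Linear(2 * n_factors, 10)
out = lin1(cat)
```

So the shapes themselves are consistent; the Linear layers are not where the error comes from.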
wd = 1e-2
criterion = nn.MSELoss()  # MSE
model = CollabChEMBL(n_mols, n_targets, n_factors, y_range)
model
learn = Learner(data, model, loss_func=criterion, metrics=root_mean_squared_error, model_dir='/kaggle/working/')
learn.lr_find()
RuntimeError: index out of range at /opt/conda/conda-bld/pytorch-cpu_1556653093101/work/aten/src/TH/generic/THTensorEvenMoreMath.cpp:193
Small edit: for some reason, when I add +1 to the embedding sizes, lr_find() runs just fine:
self.m_weight = embedding(n_mols + 1, n_factors)    # User embedding
self.t_weight = embedding(n_target + 1, n_factors)  # Item embedding
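A sketch of one likely explanation for why the +1 helps (small made-up sizes, not the actual data): nn.Embedding(n, d) only accepts indices 0..n-1, so if any id in the data equals n (e.g. ids running 1..n instead of 0..n-1), the lookup raises exactly this kind of index-out-of-range error, and allocating one extra row hides it:

```python
import torch

# An embedding table with n rows accepts indices 0..n-1 only.
n = 4
emb = torch.nn.Embedding(n, 3)

ok = emb(torch.tensor([0, n - 1]))  # valid lookups at both ends

try:
    emb(torch.tensor([n]))          # one past the end, like a 1-based id
    out_of_range = False
except (IndexError, RuntimeError):
    out_of_range = True             # lookup fails with an index error
```

If that is what is happening, the +1 works but leaves one unused (never-trained) row; remapping the ids to contiguous 0-based codes would be the cleaner fix.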