I’ve been trying all day to build a custom collaborative filtering model, taking inspiration from posts like this one [https://towardsdatascience.com/fast-ai-season-1-episode-5-3-collaborative-filtering-using-neural-network-48e49d7f9b36]. But it just isn’t working. In fact, not even the example from the link seems to work: lr_find() gives an error about the input and the target not matching when calculating the loss. What am I doing wrong?
Here’s my code so far:
n_factors=50 # Number of factors
n_mols = len(train_df['userid'].unique()) # Number of users
n_targets = len(train_df['itemid'].unique()) # Number of items
n_mols,n_targets
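As a side note on those counts: they only line up with the embedding tables if the raw ids are contiguous integers starting at 0. A minimal pure-Python sketch of remapping arbitrary ids to 0-based codes (the id values below are made up, standing in for a dataframe column):

```python
# Sketch: map arbitrary ids to contiguous 0-based codes so every value
# can safely index an embedding table of size len(uniq).
userids = ['CHEMBL25', 'CHEMBL25', 'CHEMBL112', 'CHEMBL521']  # hypothetical ids

uniq = sorted(set(userids))                        # unique ids, stable order
code_of = {uid: i for i, uid in enumerate(uniq)}   # id -> 0..n-1
codes = [code_of[uid] for uid in userids]          # remapped column
n_mols = len(uniq)                                 # embedding rows needed
```

With pandas the same thing would be a one-liner via something like `factorize`, but the idea is identical: after remapping, `max(codes) == n_mols - 1`, so no lookup can run past the table.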
class CollabChEMBL(nn.Module):
    def __init__(self, n_mols, n_target, n_factors, y_range):
        super().__init__()
        self.y_range = y_range
        self.m_weight = embedding(n_mols, n_factors)    # User embedding
        self.t_weight = embedding(n_target, n_factors)  # Item embedding
        self.lin1 = nn.Linear(100, 10)
        self.lin2 = nn.Linear(10, 1)

    def forward(self, mb, tb):
        cat = torch.cat([self.m_weight(mb), self.t_weight(tb)], dim=1)
        x = F.dropout(cat)   # Dropout_1
        x = self.lin1(x)     # Linear_1
        x = F.relu(x)        # Activation_1
        x = F.dropout(x)     # Dropout_2
        x = self.lin2(x)     # Linear_2
        # Scale the sigmoid output into y_range
        x = torch.sigmoid(x) * (self.y_range[1] - self.y_range[0]) + self.y_range[0]
        return x
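To sanity-check the layer sizes in that forward pass: with n_factors = 50, the two embedding lookups concatenate into a 100-wide tensor, which is exactly what nn.Linear(100, 10) expects. A small standalone sketch (table sizes and index values here are arbitrary, just for illustration):

```python
import torch

n_factors = 50
m_weight = torch.nn.Embedding(5, n_factors)  # 5 dummy "users"
t_weight = torch.nn.Embedding(7, n_factors)  # 7 dummy "items"

mb = torch.tensor([0, 1, 2])  # batch of 3 user indices
tb = torch.tensor([4, 5, 6])  # batch of 3 item indices

# Concatenating two (3, 50) lookups gives a (3, 100) input for lin1.
cat = torch.cat([m_weight(mb), t_weight(tb)], dim=1)
lin1 = torch.nn.Linear(2 * n_factors, 10)
out = lin1(cat)
```

So the shapes themselves are consistent; the Linear layers are not where the error comes from.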
wd = 1e-2
criterion = nn.MSELoss()  # MSE
model = CollabChEMBL(n_mols, n_targets, n_factors, y_range)
model
learn = Learner(data, model, loss_func=criterion, metrics=root_mean_squared_error, model_dir='/kaggle/working/')
learn.lr_find()
RuntimeError: index out of range at /opt/conda/conda-bld/pytorch-cpu_1556653093101/work/aten/src/TH/generic/THTensorEvenMoreMath.cpp:193
Small edit: for some reason, when I add +1 to the embedding sizes, lr_find() runs just fine:
self.m_weight = embedding(n_mols + 1, n_factors)    # User embedding
self.t_weight = embedding(n_target + 1, n_factors)  # Item embedding
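A sketch of one likely explanation for why the +1 helps (small made-up sizes, not the actual data): nn.Embedding(n, d) only accepts indices 0..n-1, so if any id in the data equals n (e.g. ids running 1..n instead of 0..n-1), the lookup raises exactly this kind of index-out-of-range error, and allocating one extra row hides it:

```python
import torch

# An embedding table with n rows accepts indices 0..n-1 only.
n = 4
emb = torch.nn.Embedding(n, 3)

ok = emb(torch.tensor([0, n - 1]))  # valid lookups at both ends

try:
    emb(torch.tensor([n]))          # one past the end, like a 1-based id
    out_of_range = False
except (IndexError, RuntimeError):
    out_of_range = True             # lookup fails with an index error
```

If that is what is happening, the +1 works but leaves one unused (never-trained) row; remapping the ids to contiguous 0-based codes would be the cleaner fix.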