Implementation of collaborative filtering for implicit feedback

(Nick) #1

Hi, I am trying to implement collaborative filtering for an implicit feedback dataset, based on Lesson 5 https://github.com/fastai/fastai/blob/master/courses/dl1/lesson5-movielens.ipynb and this code https://github.com/maciejkula/spotlight/blob/master/spotlight/factorization/implicit.py. The issue is that the loss wiggles around 1.0 and does not go down. I have tried different learning rates, weight decay, etc. I must be doing something wrong, or there is a mistake somewhere in my code. Here is the full code:

from fastai.imports import *
from fastai.column_data import *

class ScaledEmbedding(nn.Embedding):
    def reset_parameters(self):
        self.weight.data.normal_(0, 1.0 / self.embedding_dim)
        if self.padding_idx is not None:
            self.weight.data[self.padding_idx].fill_(0)
class ZeroEmbedding(nn.Embedding):
    def reset_parameters(self):
        self.weight.data.zero_()
        if self.padding_idx is not None:
            self.weight.data[self.padding_idx].fill_(0)

class ImplicitFeedBackModel(nn.Module):
    def __init__(self, n_factors, n_users, n_items):
        super().__init__()
        self.n_items = n_items
        self._random_state = np.random.RandomState()
        self.u = ScaledEmbedding(n_users, n_factors)
        self.i = ScaledEmbedding(n_items, n_factors)
        self.ub = ZeroEmbedding(n_users, 1)
        self.ib = ZeroEmbedding(n_items, 1)

    def forward(self, users, items):
        #import pdb;pdb.set_trace()
        um = self.u(users) * self.i(items)
        positive_predictions = um.sum(1) + self.ub(users).squeeze() + self.ib(items).squeeze()
        random_items = self._random_state.randint(0, self.n_items, len(users), dtype=np.int64)
        um2 = self.u(users) * self.i(V(random_items))
        negative_predictions = um2.sum(1) + self.ub(users).squeeze() + self.ib(V(random_items)).squeeze()
        return (positive_predictions, negative_predictions)

class ImplicitCollabFilterDataset(CollabFilterDataset):
    def get_model(self, n_factors):
        model = ImplicitFeedBackModel(n_factors, self.n_users, self.n_items)
        return CollabFilterModel(to_gpu(model))

    def get_learner(self, n_factors, val_idxs, bs, **kwargs):
        return ImplicitCollabFilterLearner(self.get_data(val_idxs, bs), self.get_model(n_factors), **kwargs)

class ImplicitCollabFilterLearner(Learner):
    def __init__(self, data, models, **kwargs):
        super().__init__(data, models, **kwargs)
    def _get_crit(self, data): return pointwise_loss

def pointwise_loss(x,y):
    positives_loss = 1.0 - F.sigmoid(x[0])
    negatives_loss = F.sigmoid(x[1])
    loss = positives_loss + negatives_loss
    return loss.mean()


path = 'data/ml-latest-small/ml-latest-small/'
ratings = pd.read_csv(path+'ratings.csv')
val_idxs = get_cv_idxs(len(ratings))
n_factors = 32
bs = 256
dataset = ImplicitCollabFilterDataset.from_csv(path,'ratings.csv', 'userId', 'movieId','rating')
learner = dataset.get_learner(n_factors, val_idxs, bs, opt_fn=optim.Adam)
learner.fit(1e-2, 3, cycle_len=1, cycle_mult=2)

Maybe someone has already tried to implement this approach.
Please advise.
Thanks.

0 Likes

(Nick) #2

OK, I figured this out. The issue was that I returned a tuple from the forward function, and model.py contains this line of code:

if isinstance(output,tuple): output,*xtra = output

I changed the return type to a list and it works fine.

0 Likes
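
For readers following along, here is a minimal sketch of why the tuple gets split while the list passes through. It only mimics the single line quoted above; the helper name unpack_like_model_py and the plain-list stand-ins for the prediction tensors are assumptions for illustration, not fastai code.

```python
def unpack_like_model_py(output):
    """Mimic the quoted check: only a tuple is split into output and extras."""
    xtra = []
    if isinstance(output, tuple):
        output, *xtra = output
    return output, xtra


# Stand-ins for the positive and negative prediction tensors (hypothetical values).
pos, neg = [0.9, 0.8], [0.1, 0.2]

# Tuple: the negatives are split off into `xtra`, so only `pos` is kept as the output.
as_tuple, extras_tuple = unpack_like_model_py((pos, neg))

# List: left untouched, so both sets of predictions stay together.
as_list, extras_list = unpack_like_model_py([pos, neg])

print(as_tuple, extras_tuple)
print(as_list, extras_list)
```

If the loss only receives the positive predictions, then x[0] and x[1] inside pointwise_loss would presumably index two individual elements of that one tensor rather than the positive and negative batches, which would be consistent with a loss that never improves.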

(Masaki Kozuki) #3

Does this refer to fastai/fastai/model.py?

If so, you can convert a tuple to a list with the snippet below.

if isinstance(output, tuple):
    output = list(output)
0 Likes

(Nick) #4

Yeah, I just changed the forward method to return a list instead of a tuple.

return (positive_predictions, negative_predictions) 

->

return [positive_predictions, negative_predictions]
0 Likes

(Masaki Kozuki) #5

Ah, ok.

Sorry for my misunderstanding.

Replacing the line in fastai/fastai/model.py as below will work!

if isinstance(output, (list, tuple)):
    output, *xtra = output
0 Likes

(MARTIN Alix) #6

Shouldn’t the loss function pointwise_loss(x,y) also take y into account?

You seem to assume that it will always be called with an x that contains the positive examples in the first half and the negative ones in the second half.

0 Likes

(Nick) #7

Right. That is because y contains rating values, which we don’t care about in an implicit feedback model.

0 Likes
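
As a small illustration of the point above, here is a sketch of the same pointwise loss in plain Python. It takes only the two sets of scores and never looks at the ratings. The function name pointwise_loss_plain and the example scores are assumptions made for this sketch. It also shows that when every score is zero (for example, before the model has learned anything useful), the loss is exactly 1.0, which matches the value the loss was hovering around in the first post.

```python
import math


def sigmoid(z):
    """Logistic function for a single float."""
    return 1.0 / (1.0 + math.exp(-z))


def pointwise_loss_plain(positive_scores, negative_scores):
    """Mean of (1 - sigmoid(pos)) + sigmoid(neg); the rating values are never used."""
    losses = [
        (1.0 - sigmoid(p)) + sigmoid(n)
        for p, n in zip(positive_scores, negative_scores)
    ]
    return sum(losses) / len(losses)


# All-zero scores: sigmoid(0) = 0.5, so each term is 0.5 + 0.5.
untrained = pointwise_loss_plain([0.0] * 4, [0.0] * 4)

# Well-separated scores: positives high, negatives low, so the loss approaches 0.
separated = pointwise_loss_plain([5.0] * 4, [-5.0] * 4)

print(untrained, separated)
```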

(Joakim Rishaug) #8

How did this mixture of implicit feedback and collab filtering work out, @bny6613?
Did you do any benchmarks against pure Spotlight implicit models?

0 Likes

(Yash Mittal) #9

Hi @bny6613, I am trying to build a recommender system using only implicit features. I have more than one implicit feature for the user, such as demographic and personal details, and similarly I have multiple features for each item. I have searched in a number of places, but most people seem to use only one implicit feature. Can you please suggest how I can build a recommender that uses multiple features?

1 Like

(Paul JL Wu) #10

@bny6613 Hi Nick, did you work out a solution? I would be very interested in taking a look at it if you can share it somewhere like GitHub. Thanks Nick!

0 Likes

(Paul JL Wu) #11

Thanks for sharing your solution! Do you know how to write a script that recommends items (products)? For example, recommendation(user_id = 314, n_item = 10) would return 10 recommended products for user 314, where the recommended products should NOT include those that user 314 has already purchased.

0 Likes
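
One possible shape for such a function, as a rough sketch only: it assumes the trained user and item embeddings and the item biases have already been exported as NumPy arrays, that raw ids have been mapped to row indices, and that seen_items maps each user index to the set of item indices that user already has. The name recommend and all of its arguments are hypothetical, not part of fastai or Spotlight.

```python
import numpy as np


def recommend(user_idx, n_items, user_factors, item_factors, item_bias, seen_items):
    """Return up to n_items unseen item indices for one user, best score first."""
    # Score every item; the user bias is the same for all items, so it does not
    # change the ranking and is left out here.
    scores = (item_factors @ user_factors[user_idx] + item_bias).astype(float)

    # Push already-seen items to the bottom so they can never be recommended.
    seen = np.array(sorted(seen_items.get(user_idx, ())), dtype=np.int64)
    scores[seen] = -np.inf

    n = min(n_items, len(scores) - len(seen))
    return np.argsort(-scores)[:n].tolist()


# Tiny hand-made example: 2 users, 4 items, 2 factors (made-up numbers).
user_factors = np.array([[1.0, 0.0], [0.0, 1.0]])
item_factors = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.5, 0.5]])
item_bias = np.zeros(4)
seen_items = {0: {0}}  # user 0 already has item 0

top_user0 = recommend(0, 2, user_factors, item_factors, item_bias, seen_items)
top_user1 = recommend(1, 2, user_factors, item_factors, item_bias, seen_items)
print(top_user0, top_user1)
```

Scoring every item is fine for a small catalogue like ml-latest-small; a much larger catalogue would likely need a partial sort or an approximate nearest-neighbour lookup instead.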
