Lesson 7: Collaborative Filtering

This is from my Colab notebook, where I am trying to re-run the code.

I was able to develop the intuition in Excel and am now trying to implement it in PyTorch.

I am stuck at this step:

class DotProduct(Module):
    def __init__(self, n_users, n_movies, n_factors):
        self.user_factors = Embedding(n_users, n_factors)
        self.movie_factors = Embedding(n_movies, n_factors)

    def forward(self, x):
        users = self.user_factors(x[:,0])
        movies = self.movie_factors(x[:,1])
        return (users * movies).sum(dim=1)

My questions are:

  1. Where does Embedding come from? Is it implemented somewhere? Do we need to go look at that implementation, or can we just assume some embedding lookup is happening?
  2. When we do this:

model = DotProduct(n_users, n_movies, 50)
learn = Learner(dls, model, loss_func=MSELossFlat())

we are passing the model to the Learner, but the forward method is never called anywhere in our code. Does the fastai Learner know that it has to call the forward method?

I am quite confused, any help is appreciated :slight_smile:

@guptamols Embedding is a PyTorch module, nn.Embedding. It is basically a torch tensor wrapped in nn.Parameter, so when you call an Embedding object with a tensor of ids, say tensor([0, 1]), you are indexing the rows of the tensor held inside the Embedding object.

The second part of your question isn't entirely clear to me, but basically you can think of the Learner class as an abstraction over the training loop that you would typically write by hand in PyTorch. It needs the dataloaders dls, a model (an instance of a PyTorch or fastai Module), a loss function and, optionally, metric functions and callbacks. Inside the Learner, batches of data are passed to the DotProduct instance, which is your model: a batch called x_batch, for example, is passed to the model as model(x_batch). When this happens, the forward method is called with that batch. We don't explicitly call model.forward because there is some extra code (hooks) that runs in the background, so you can think of model(x_batch) as model.forward(x_batch). I hope this helps.
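Here is a minimal, self-contained sketch in plain PyTorch of both points. Everything in it (the sizes, the toy batch, the variable names) is made up for illustration, and it re-implements DotProduct with nn.Module and nn.Embedding rather than the fastai Module and Embedding used in the lesson:

import torch
from torch import nn

# An embedding is a learnable lookup table: a (num_rows, n_factors) weight
# matrix wrapped in nn.Parameter so the optimizer can update it.
emb = nn.Embedding(num_embeddings=5, embedding_dim=3)
print(type(emb.weight))                         # <class 'torch.nn.parameter.Parameter'>
ids = torch.tensor([0, 1])
print(emb(ids).shape)                           # torch.Size([2, 3]) -- rows 0 and 1

# Calling a module dispatches to forward (plus hooks), so these match.
print(torch.equal(emb(ids), emb.forward(ids)))  # True

# A plain-PyTorch version of the model; nn.Module needs super().__init__(),
# which fastai's Module does for you automatically.
class DotProduct(nn.Module):
    def __init__(self, n_users, n_movies, n_factors):
        super().__init__()
        self.user_factors = nn.Embedding(n_users, n_factors)
        self.movie_factors = nn.Embedding(n_movies, n_factors)

    def forward(self, x):
        users = self.user_factors(x[:, 0])
        movies = self.movie_factors(x[:, 1])
        return (users * movies).sum(dim=1)

model = DotProduct(n_users=10, n_movies=10, n_factors=4)
x_batch = torch.tensor([[0, 3], [1, 7]])        # columns: user id, movie id
y_batch = torch.tensor([3.5, 4.0])              # ratings

# One step of the hand-written training loop that Learner wraps for you:
preds = model(x_batch)                          # forward is called here
loss = nn.functional.mse_loss(preds, y_batch)
loss.backward()                                 # gradients flow into the embeddings
print(model.user_factors.weight.grad.shape)     # torch.Size([10, 4])

This is why you never call forward yourself: you hand the model to the Learner, and the Learner (like the loop above) simply calls model(x_batch) on each batch.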

Sure, this clarifies the gaps :slight_smile:

At this point (Lesson 7), can I just use the fastai classes as a black box, or should I dive deeper into the details, e.g. what nn.Parameter is?

It looks like writing everything in plain PyTorch is covered in part 2 of the course. Just confirming in advance, as I will proceed to part 2 soon.

@guptamols I think you should finish the course and then work through the fastai book. That should leave you feeling confident about using the fastai library. Also, don't hesitate to look at the source code of any object. My personal favorite is using the doc function to read about fastai objects and their source code.
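For example, after the usual star import (a rough sketch; I believe doc comes along with the standard fastai imports, and ?? is plain Jupyter/IPython, so treat the exact calls below as illustrative):

from fastai.collab import *

# doc() shows the docstring plus links to the docs page and the source code.
doc(Learner)
doc(Embedding)

# In Jupyter/Colab, ?? shows the full source of any object directly:
# Embedding??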

I found tremendous value in reading the book. Now that I'm getting started with part 2 of the course, I'm more confident in my ability to work through the details.
