I’m struggling to fit the collaborative filtering embeddings example from this lesson into my understanding of how a deep neural net works. I’m sure my thinking / mental model is wrong somewhere, and it would be really helpful if someone could point out where I’m making a mistake.
When I think about forward propagation in a neural net, I picture this equation: output = activation_function(weight * input + bias). What I can’t figure out in the collaborative filtering embedding example is whether the embeddings (users & movies) are the weights or the inputs.
My initial reaction is that they are the weights, since we are using SGD to find them. But if they are the weights, then what is the input, and where is the input multiplied by the weights (as per the equation above)? All I see is the dot product of the user and movie embeddings, but those are the weights, and I was expecting something like y = wx + b.
Would really appreciate it if anyone could help clarify.
The embedding layer has weights. The inputs to the model (and to that layer) are the user/movie ids. You can think of each id as a one-hot-encoded vector that gets multiplied by the embedding weight matrix; the lookup is just a more efficient way of doing that multiplication.
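A minimal sketch to convince yourself of the equivalence (the sizes here are made up):

```python
import torch
import torch.nn.functional as F

# Made-up sizes: 10 users, 4 latent factors
emb = torch.nn.Embedding(10, 4)

user_id = torch.tensor([3])               # the model input: just an id
one_hot = F.one_hot(user_id, 10).float()  # the same id as a one-hot vector

lookup = emb(user_id)            # embedding lookup: index into the weights
matmul = one_hot @ emb.weight    # one-hot vector times the weight matrix

assert torch.allclose(lookup, matmul)     # identical result
```

So the embeddings are weights, and your y = wx + b intuition still holds: x is the one-hot id, and the "lookup" is the wx multiplication in disguise.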
Hi all. I was trying to understand how the Google paper mentioned in chapter 9 of the book implements the dot product model (from chapter 8) as part of its “wide network”.
They pass features through a “cross-product transformation” and then take a dot product with a vector of learnable weights. The cross-product transformation seems to multiply two or more binary categorical features together. How is this the same as our “dot product” approach for collaborative filtering? It confuses me because we never multiply two or more binary variables together anywhere in our dot product approach.
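Here is a minimal sketch of my current understanding of the cross-product transformation (the feature names are made up):

```python
import numpy as np

# Two made-up binary (one-hot) feature columns over a batch of 4 examples
gender_male = np.array([1, 0, 1, 1])
language_en = np.array([1, 1, 0, 1])

# Cross-product transformation: the elementwise product, i.e. a logical AND.
# The new binary feature is 1 only when every constituent feature is 1.
cross = gender_male * language_en
print(cross)  # [1 0 0 1]
```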
Any help in understanding would be super useful. Thanks!
Working my way through the Kaggle notebook Scaling Up: Road to the Top, Part 3 and running into a few issues. Looks like convnext_small_in22k is no longer included in the timm library, which is to be expected given my late arrival to the party!
Tried something that looked similar, convnext_small.fb_in22k, but got this error:
For a visual multi-label classification task I am trying to implement a multi-target model based on Lesson 7, Multi-target: Road to the Top, Part 4.
In the section ‘Replicating the disease model’, Cell 8, Jeremy shows a way to explicitly pass a loss function and an error function to the learner.
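The pattern looks roughly like this (a sketch, not the notebook’s exact cell; dls and the architecture here are placeholders):

```python
from fastai.vision.all import *

# Instead of letting fastai infer them from the data, the loss function
# and the error metric are handed to the learner explicitly.
learn = vision_learner(dls, resnet18,
                       loss_func=F.cross_entropy,  # explicit loss function
                       metrics=error_rate)         # explicit error function
```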
Has anyone tried multi-target classification with the Titanic dataset? I am not able to correctly pass in a customized loss function to predict Survived and Pclass at the same time. Are there any notebooks where this has been tried and worked?
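In case it helps anyone answer, what I am attempting is roughly this (the column split is my own assumption: a head that outputs 2 activations for Survived and 3 for Pclass):

```python
import torch.nn.functional as F

# Assumed layout: the final layer outputs 5 activations per row, the first 2
# for Survived (0/1) and the last 3 for Pclass. Note Pclass is 1..3 in the
# raw data, so the labels need to be shifted to 0..2 for cross_entropy.
def titanic_loss(pred, survived, pclass):
    surv_loss  = F.cross_entropy(pred[:, :2], survived)
    class_loss = F.cross_entropy(pred[:, 2:], pclass)
    return surv_loss + class_loss
```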
I got the same error. The issue was that a newer version of the timm library was being installed that didn’t seem to be compatible with the notebook. Pinning timm to the exact version fixed the issue for me (though the session has to be restarted for it to take effect; just rerunning the cell won’t overwrite the library with the older version).
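For example (the exact version number below is a guess; pin whatever version the notebook was originally written against):

```python
# In the Kaggle notebook, before any fastai/timm imports:
!pip install -q "timm==0.6.13"
# Then restart the session so the pinned version is actually loaded.
```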
@jeremy Why did you choose 5e-3 as the learning rate in fit_one_cycle() in the “Collaborative filtering deep dive” notebook? This number seems magical to me. Can you help me understand how you arrived at it?
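For context, the only non-magical approach I know of is the learning-rate finder (assuming learn is the collab learner from the notebook):

```python
# Plots loss against learning rate; the usual advice is to pick a value on
# the steep downward slope, well before the point where the loss blows up.
learn.lr_find()
```

But it’s not obvious to me that this is where 5e-3 came from.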
Another question I have about the “Collaborative filtering deep dive” notebook is:
Why does the collaborative filtering model built with fastai’s collab_learner outperform the deep learning model built with the CollabNN class? It seems to perform even worse when I add extra hidden layers, as in the sketch below.
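A minimal sketch of what I mean by adding extra hidden layers (the layer sizes are arbitrary; this uses collab_learner’s use_nn option from the book):

```python
from fastai.collab import *

# use_nn=True builds the deep model instead of the dot-product model;
# layers sets the hidden layer sizes (these particular values are arbitrary).
learn = collab_learner(dls, use_nn=True, y_range=(0, 5.5), layers=[100, 50])
```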