I am confused about how collaborative filtering is different from tabular. Is it just a special case?
Convert emojis to words, maybe using some mapping, before feeding them into the NN.
Yes, you need to recognize the tokens somehow to identify them.
Do you ever have to take into consideration that you have multiple samples/observations per subject with deep learning? e.g. when you have multiple movie reviews from the same person, or when you have multiple images from the same brain or slices from the same MRI for classification, or do neural nets not care?
Finetuning. Yes, that is possible and easier in fastai since weights are matched internally. For more info see load_pretrained.
Jeremy just mentioned there are different LMs in the Zoo for different languages. Do you have something “meta”, like an LM to do language detection first?
Emoji are usually encoded with the word that describes the expression. For example, 😂 is :joy:. So yes, there is an easy mapping here. Just remove the colons.
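To make the "just remove the colons" idea concrete, here is a minimal sketch. It assumes the text already contains emoji as `:shortcode:` strings; the function name `shortcodes_to_words` and the regex are my own illustration, not part of fastai. Raw Unicode emoji would first need converting to shortcodes, for example with a library such as `emoji`.

```python
import re

# Matches shortcodes such as ":joy:" or ":thumbs_up:".
# Requiring a leading letter avoids touching times like "10:30:45".
SHORTCODE = re.compile(r":([a-z][a-z0-9_+\-]*):")

def shortcodes_to_words(text: str) -> str:
    """Replace each :shortcode: with plain words, e.g. ':thumbs_up:' -> 'thumbs up'."""
    return SHORTCODE.sub(lambda m: m.group(1).replace("_", " "), text)

print(shortcodes_to_words("this movie was great :joy: :thumbs_up:"))
```

The resulting words can then go through the usual tokenizer like any other text.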
What’s the role of the timestamp in collaborative filtering? Does it need to know about movie genres or other metadata about the product? Should we consider browsing patterns in collaborative filtering?
Happy birthday, Jeremy!
Happy birthday 🎂
Happy birthday!
Happy birthday!
Congratulations Jeremy, happy birthday!!!
Happy Birthday my friend!!
Happy birthday Jeremy!
I remember seeing a Netflix blog post on cold start. It may be a good resource to read in general.
How do you measure how well you’re doing in collaborative filtering? (Like an accuracy rate, for instance.)
Let’s have a separate thread for birthday wishes to Jeremy.
It is not practical to use plain tabular learning approaches for collaborative filtering problems. There could be millions of movies or products, which means millions of columns per user, and most ML methods can’t train on such a huge and sparse set of data. Collaborative filtering uses tricks to condense it into a smaller space to find a meaningful mapping between users and movies.
Collaborative filtering is used to build recommendation systems, where we are trying to recommend something to a user based on ratings. Tabular data is about trying to predict sales or any other continuous variable.
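The "condense it into a smaller space" idea can be sketched as a tiny matrix factorization: each user and each movie gets a short vector, and the predicted rating is their dot product. This is a pure-Python illustration with made-up ratings and hypothetical names (`train`, `rmse`), not the fastai implementation. It also shows one common way to measure how well you are doing: root mean squared error (RMSE) between predicted and actual ratings.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rmse(ratings, user_vecs, item_vecs):
    """Root mean squared error of dot-product predictions on (user, item, rating) triples."""
    sq = [(r - dot(user_vecs[u], item_vecs[i])) ** 2 for u, i, r in ratings]
    return (sum(sq) / len(sq)) ** 0.5

def train(ratings, n_users, n_items, n_factors=2, lr=0.05, epochs=200, seed=0):
    """Learn one small vector per user and per item with plain SGD."""
    rng = random.Random(seed)
    user_vecs = [[rng.uniform(0, 1) for _ in range(n_factors)] for _ in range(n_users)]
    item_vecs = [[rng.uniform(0, 1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(user_vecs[u], item_vecs[i])
            for k in range(n_factors):
                pu, qi = user_vecs[u][k], item_vecs[i][k]
                user_vecs[u][k] += lr * err * qi  # gradient step on the user factor
                item_vecs[i][k] += lr * err * pu  # gradient step on the item factor
    return user_vecs, item_vecs

# Made-up data: only the ratings that exist are stored, so there are no sparse columns.
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 5.0), (2, 2, 2.0)]
user_vecs, item_vecs = train(ratings, n_users=3, n_items=3)
print("training RMSE:", rmse(ratings, user_vecs, item_vecs))
```

For brevity this reports RMSE on the training ratings. In practice you would hold out some ratings as a validation set and report the error on those.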