Lesson 7 - Official topic

sgugger · April 29, 2020, 1:52am

It’s another hyper-parameter you have to pick, so you can try a few values and see what works best.
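
To make "try a few values" concrete, here is a minimal sketch of a weight-decay sweep. It is not the course's code: the data is synthetic, the model is a plain linear regression solved in closed form, and `fit_with_wd`, `mse` and the candidate list are hypothetical names chosen for illustration. The idea is only that each `wd` is fitted on the training set and compared on a held-out validation set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression problem (made-up data, for illustration only).
n_train, n_valid, n_features = 40, 200, 30
true_w = rng.normal(size=n_features)
X_train = rng.normal(size=(n_train, n_features))
X_valid = rng.normal(size=(n_valid, n_features))
y_train = X_train @ true_w + rng.normal(scale=2.0, size=n_train)
y_valid = X_valid @ true_w + rng.normal(scale=2.0, size=n_valid)

def fit_with_wd(X, y, wd):
    """Minimise mean squared error + wd * ||w||^2 (closed-form solution)."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + wd * np.eye(d), X.T @ y / n)

def mse(X, y, w):
    """Mean squared error of the linear model w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))

# Try a few values spread over several orders of magnitude.
candidates = [0.0, 0.001, 0.01, 0.1, 1.0]
valid_losses = {
    wd: mse(X_valid, y_valid, fit_with_wd(X_train, y_train, wd))
    for wd in candidates
}

# Keep the value with the lowest validation loss.
best_wd = min(valid_losses, key=valid_losses.get)
print(valid_losses, best_wd)
```

With a real model the loop is the same: train once per candidate value and compare validation metrics, never training loss.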

chengwliu · April 29, 2020, 1:52am

Any advice on how to select the best wd (the weight decay hyper-parameter)?

Raymond-Wu · April 29, 2020, 1:52am

More data is better! If you have no data, look up solutions to the cold start problem.

Nonnormalizable · April 29, 2020, 1:52am

What motivates learning a 50-dimensional embedding and then using PCA to reduce it to 3, versus learning a 3-dimensional embedding directly?

barnacl · April 29, 2020, 1:53am

So this is not selected the way we pick for categorical columns, based on cardinality? @sgugger

Raymond-Wu · April 29, 2020, 1:53am

Answered here: https://forums.fast.ai/t/lesson-7-official-topic/69896/18?u=raymond-wu

ganesh.bhat · April 29, 2020, 1:53am

If we don’t have user-item ratings, which is the case for B2B product recommendation, how do we build this matrix?

ilovescience · April 29, 2020, 1:53am

Probably mainly because the model can learn a lot more with a 50-dimensional embedding, but that is hard to visualize, hence the use of PCA.
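
As a rough illustration of that workflow, the sketch below projects a 50-dimensional embedding matrix onto its top 3 principal components with NumPy. The `movie_factors` array is random stand-in data, not real learned weights, and the SVD-based PCA here is a generic formulation rather than whatever helper the lesson notebook uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a learned embedding matrix: 100 movies x 50 latent factors.
n_movies, n_factors = 100, 50
movie_factors = rng.normal(size=(n_movies, n_factors))

# PCA: centre the data, then take the top right-singular vectors.
centered = movie_factors - movie_factors.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
components = Vt[:3]                    # 3 orthonormal directions in factor space
movie_pca = centered @ components.T    # shape (100, 3), ready for plotting

# Fraction of the total variance kept by the 3 components.
explained = float((S[:3] ** 2).sum() / (S ** 2).sum())
print(movie_pca.shape, explained)
```

The model still trains with all 50 factors; the 3 components are only a lossy view for plotting, and `explained` shows how much of the variance that view retains.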

gamino · April 29, 2020, 1:54am

Would having a higher-dimensional embedding be better for using CosineSimilarity? In other words, would it be more precise?

raphaelr · April 29, 2020, 1:54am

Question: is it possible for a collaborative filtering model to learn by omission? For example, if I hate horror movies and therefore never watch and rate them, would it be possible for the model to “learn” that preference (outside of second-order associated preferences)?

harish3110 · April 29, 2020, 1:54am

How does looking at the top biased movies help understand the movie likability? I think I missed this one. How does low bias equal likability?

giacomov · April 29, 2020, 1:55am

One idea is to use metrics like “how many people have bought/looked at/put in the cart this product”, for example.
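
One possible reading of this, sketched below, is to replace explicit ratings with a weighted count of interactions per account and product. Everything here is hypothetical: the event log, the account and product names, and especially the `weights`, which are arbitrary choices that would need tuning on real data.

```python
from collections import defaultdict

# Hypothetical interaction log: (account, product, kind of interaction).
events = [
    ("acct_a", "widget", "view"),
    ("acct_a", "widget", "purchase"),
    ("acct_a", "gadget", "view"),
    ("acct_b", "gadget", "cart"),
    ("acct_b", "gadget", "view"),
]

# Arbitrary weights: stronger signals of interest count for more.
weights = {"view": 1.0, "cart": 3.0, "purchase": 5.0}

def build_matrix(events, weights):
    """Sum weighted interactions into a dense account x product matrix."""
    scores = defaultdict(float)
    for account, product, kind in events:
        scores[(account, product)] += weights[kind]
    accounts = sorted({a for a, _, _ in events})
    products = sorted({p for _, p, _ in events})
    matrix = [[scores.get((a, p), 0.0) for p in products] for a in accounts]
    return accounts, products, matrix

accounts, products, matrix = build_matrix(events, weights)
for account, row in zip(accounts, matrix):
    print(account, dict(zip(products, row)))
```

A zero in this matrix means "no interaction observed", not "disliked", so it should not be treated the same way as a low rating.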

ilovescience · April 29, 2020, 1:55am

I think it was the opposite. High bias is correlated with likability.

EDIT: But it also means much more than that. It takes into account the genres or actors or other factors. For example, a movie with low bias means that even if you like this type of movie, you may not like this particular movie (and vice versa for movies with high bias).

harikrishnanrajeev · April 29, 2020, 1:55am

What would be your preferred (best) similarity measure apart from cosine similarity?

sgugger · April 29, 2020, 1:56am

It’s the score of the movie, independent of the user preferences. If it’s a romance, a user who usually hates romances would rate it low and a user who loves romances would rate it high, but the bias is the value you add once you have taken into account those matches (or mismatches) between user preferences and the movie features.
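
A tiny sketch of that decomposition, using made-up numbers: the raw score is the dot product of user and movie factors (the match or mismatch) plus the two bias terms. The function name `predict`, the two-factor vectors and all values are hypothetical, and any squashing of the score into the rating range is left out.

```python
def predict(user_factors, movie_factors, user_bias, movie_bias):
    """Raw score = preference/feature match + user bias + movie bias."""
    match = sum(u * m for u, m in zip(user_factors, movie_factors))
    return match + user_bias + movie_bias

# Two latent factors, loosely "romance" and "action" (made-up values).
romance_movie = [1.0, 0.0]
romance_lover = [2.0, 0.0]
romance_hater = [-2.0, 0.0]

for movie_bias in (0.0, 0.5):
    lover = predict(romance_lover, romance_movie, 0.0, movie_bias)
    hater = predict(romance_hater, romance_movie, 0.0, movie_bias)
    print(f"movie_bias={movie_bias}: lover={lover}, hater={hater}")
```

The gap between the two users comes entirely from the dot product; the movie bias shifts both scores by the same amount, which is why it reads as the movie's score independent of user preferences.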

geoffpidcock · April 29, 2020, 1:56am

My 2 cents - maybe this wouldn’t be a collaborative filtering use case. You might instead want to look at “propensity to buy” for your accounts - i.e. what have other accounts similar to the one of interest bought in the past?

Raymond-Wu · April 29, 2020, 1:56am

My understanding is that it could. If you liked a type of movie, say romance, it could look at other users who liked that category of movies and infer that they also disliked horror. Thus, you might also dislike horror.

harish3110 · April 29, 2020, 1:58am

So a high bias indicates the possibility of a user liking a movie despite it being different from his general preferences?

ganesh.bhat · April 29, 2020, 1:58am

We can surely look at the past history of sales, but since there is a time aspect to the sales, can we build such a user-item matrix based on average past sales?

sgugger · April 29, 2020, 1:59am

Exactly (or conversely a low bias means someone will probably dislike it, despite their preferences).