Is the 5.5 upper bound passed to the sigmoid function on Lesson 6 a mistake?

virior · May 11, 2021, 6:29am

In this lesson we had a model trying to predict ratings for movies. And the upper bound for the Sigmoid was passed as 5.5, and not 5.

So, this 5.5 intrigued me, and I was glad that someone asked about it. Jeremy's answer was that since the sigmoid only reaches its upper bound at infinity, it is better to pass 5.5, because that way the model can more easily predict a 5.
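To make that argument concrete, here is a minimal sketch in plain Python. It assumes the scaled sigmoid has the form `lo + (hi - lo) * sigmoid(x)`; the helper `activation_needed` is a made-up name for illustration, not something from the lesson.

```python
import math

def sigmoid_range(x, lo, hi):
    """Scaled sigmoid: output lies strictly between lo and hi for finite x."""
    return lo + (hi - lo) / (1.0 + math.exp(-x))

def activation_needed(target, lo, hi):
    """Hypothetical helper: raw activation x such that sigmoid_range(x, lo, hi) == target."""
    p = (target - lo) / (hi - lo)
    if p >= 1.0:
        return math.inf   # the upper bound is only reached in the limit
    if p <= 0.0:
        return -math.inf  # same for the lower bound
    return math.log(p / (1.0 - p))  # inverse of the sigmoid (logit)

# Same activations, two different upper bounds
for x in (2.0, 4.0, 8.0):
    print(x, round(sigmoid_range(x, 0, 5), 4), round(sigmoid_range(x, 0, 5.5), 4))

print(activation_needed(5, 0, 5.5))  # finite: ln(10), roughly 2.3
print(activation_needed(5, 0, 5))    # inf
```

With an upper bound of 5.5 a prediction of exactly 5 needs only a moderate activation, while with an upper bound of 5 it can only be approached. This shows the mechanism only; it says nothing about which setting gives the lower validation loss.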

Well, I found this answer to be reasonable, but it also made me skeptical. So I went to check and I had better results setting it to 5, not 5.5.

To make sure it was not a fluke, I ran it 10 times each, and had the same result. 5 was better.

Of course it could be a property of that specific model, so I reran this test for all of the models presented in this lesson (or, more precisely, all of the versions of the same model), and had this result:

Except for the DotProduct model with both decay and bias, the 5 scored better in general, and even in this one it was pretty close (and it did a lot better sometimes).

So does this mean that 5 is better than 5.5, at least for this model in general, and that Jeremy made a mistake? Or is this specific version, being the most complete overall, the best criterion?

JackByte · May 11, 2021, 7:46pm

Hi @virior,

try running describe() on the ratings DataFrame (e.g. df.describe()) and check the statistics of the rating column. My guess would be that 5 works better because 5-star ratings are less common in the data.
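A pandas-free sketch of that check, assuming the ratings are available as a plain list. The `sample` values below are made up for illustration and are not taken from the lesson's dataset; with a pandas column, `describe()` or `value_counts(normalize=True)` would give similar information.

```python
from collections import Counter

def rating_shares(ratings):
    """Return the fraction of all ratings that fall on each rating value."""
    counts = Counter(ratings)
    total = len(ratings)
    return {r: counts[r] / total for r in sorted(counts)}

# Made-up sample, NOT the lesson's data
sample = [5, 4, 4, 3, 5, 2, 4, 3, 1, 4]
shares = rating_shares(sample)

for rating, share in shares.items():
    print(f"{rating} stars: {share:.0%}")
```

If the share of 5-star ratings turned out to be very small, capping the range at 5 would cost little; if it is large, the choice of upper bound matters more.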

Thanks, your question might help me create a better-performing model. I’m working on some tabular data right now, and I just used min and max+something for the y_range. But maybe it would be better to ignore outliers (let's say 99% of the values are <= 4.5; then I’d stick with y_range=(0,5)).

virior · May 12, 2021, 6:12pm

Well, I just ran it and the breakpoint for 5-star ratings is .79, so they are not that uncommon. I should probably take a look at the statistics of the predictions of both models; maybe they’re distributed differently and this is not shown in the error rate. (Say one has more errors at higher ratings and the other has the opposite.)
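One way such a comparison could look is to break the error down by true rating. This is a minimal sketch: `mae_by_rating` is a made-up helper, and the numbers are placeholders rather than real predictions from either model.

```python
from collections import defaultdict

def mae_by_rating(targets, preds):
    """Mean absolute error of the predictions, grouped by the true rating."""
    buckets = defaultdict(list)
    for t, p in zip(targets, preds):
        buckets[t].append(abs(t - p))
    return {t: sum(errs) / len(errs) for t, errs in sorted(buckets.items())}

# Placeholder values, NOT real model output
targets = [5, 5, 3]
preds = [4.5, 4.0, 3.5]

errors = mae_by_rating(targets, preds)
print(errors)
```

Running this once per model, on the same validation set, would show whether one model's errors concentrate at the high ratings while the other's sit elsewhere, even when the overall loss is similar.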

JackByte · May 13, 2021, 7:10pm

Hm… 0.79 meaning about 20% are straight 5-star ratings, right?! That would mean that my guess was wrong. Sorry, then I actually have no idea why (0, 5) is performing better than (0, 5.5).

Please let me know if you investigate the error distributions further.