Recommender system benchmarks

Hi All –

Jeremy works with the MovieLens dataset in Lecture 5 – I’ve been trying to find somewhere that benchmarks the performance of various recommender system algorithms. There are a number of open source tools for this:

but I haven’t been able to find anything (papers, blogs, etc.) that compares the performance of different methods on a standard benchmark dataset.

~ Ben

The most common one I’ve seen is from Criteo, but there’s not really a consensus in the community. Some research reports results on MovieLens 20M, but a lot of companies report only on their own data.

RecSys, the main conference in recommender systems, has a competition every year that’s also worth checking out. This year’s is a Spotify playlist recommendation problem, but unfortunately it was only open to academics and finishes at the end of June.

The datasets from their previous challenges have been good and are often used in RecSys papers, but there isn’t a gold standard for evaluation, so different papers tend to use different datasets.

I’m working in the field right now and will be at RecSys. If you’re interested in the intersection between deep learning and recommender systems, hit me up; I’m happy to talk shop. Let me know if you come to a different conclusion about this, because I’m working on a paper right now and I’d like to evaluate on a few public datasets.


Hey Even! It’s been a while since your post, so I’m not sure if your offer to talk about recommender systems still stands, but I would love to have a chat if you’re up for it.
I just started working on a project that involves recommender systems, and it’s my first time working with them. I’ve mostly worked on computer vision tasks, and I’m kind of baffled by the lack of open benchmark datasets where I can directly compare different algorithms. I was also expecting one or two very popular libraries or tools for recommender systems. Instead, it seems to be a bit all over the place, as far as I can tell right now!
Happy to hear from you or anyone who feels like answering this call for help with getting into recommender systems. Thanks!

Hey Robin, welcome, and absolutely, I’m always happy to talk RecSys. Datasets are a big issue in the space, but there have been a few promising additions to the field since that earlier post. If you’re really interested in diving in, I’d take a look at this year’s RecSys Challenge. Twitter has released a 1B tweet dataset and there’s an active competition around predicting interactions there.

That’s a pretty complex jumping-off point, so MovieLens might be a better place to start. I’m now managing a team at NVIDIA that builds tools for the RecSys ecosystem, and one of our intro examples uses it. Check out

I’m not on here as often as I used to be but I check back periodically. Looking forward to hearing more about your use case.