How to Win a Data Science Competition: Learn from Top Kagglers


(Pavel Pleskov) #1

There is a new online course on Coursera about Kaggle. It is part of the Advanced Machine Learning specialization.

It is taught by four Russian grandmasters and the well-known current number three in the Kaggle rankings, Μαριος Μιχαηλιδης (aka KazAnova, creator of StackNet).

The course covers such topics as:

  • data leakage
  • feature preprocessing and generation
  • hyperparameter optimization
  • ensembling

Pros:
:heavy_plus_sign: very practical
:heavy_plus_sign: learn from top competitors
:heavy_plus_sign: things that have never been taught before

Cons:
:heavy_minus_sign: Russian accent
:heavy_minus_sign: unpolished homework assignments with little explanation
:heavy_minus_sign: $50 per month

Correlation does not imply causation, but I took it and became number 160 on Kaggle in three months :slight_smile:

I highly recommend at least auditing it!


(Pavel Pleskov) #2

@jeremy this is the course I told you about in yesterday’s lecture.


(Jeremy Howard) #3

Thanks, that’s great!


#4

Great resource, thanks for sharing @ppleskov!


#5

Thanks!


#6

If anything, that’s a pro! :smile:


(Bhabani) #7

I am currently going through the course, and I think it’s worth a try. If anyone here is worried that this course is going to be bad and a waste of time, in my opinion there is nothing to lose. It’s a good course.

Note: Bear with the Russian accent. :smile:


(Oshan) #8

@ppleskov thanks for sharing. What was your rank prior to taking the course? We might then be able to make a better comparison.
Super excited about this one!


(Dennis Sakva) #9

A simple lifehack: setting the video speed to 150% will make it much more bearable. :slight_smile:


(Utkarsh) #10

Going to start with this one! I have been searching for something like this. Thanks a lot!


(Petr Bela) #11

Halfway through the course:

Positive:

  • doesn’t spend much time on basics (assumes prior experience with ML)
  • focuses on topics that are not typically covered in depth (data preparation, leakage, metrics, optimizations, etc.)
  • combination of videos, notebooks and external links
  • builds towards a final project

Negative:

  • not about deep learning (actually, most Kaggle competitions are better won using boosted trees)
  • skills learned on Kaggle are a bit different from real-world ML
  • $50/month, although you can audit (access all the materials) for free

(Pavel Pleskov) #12

@nahso I didn’t have a rating at all, so there is no reference point. But this course and Jeremy’s advice to improve my score every day and optimize my rating were the most valuable things for me.


(Bhabani) #13

I agree with you, @ppleskov.


(Keshav Ramaswamy) #14

Two weeks into the course, my one major gripe is that the assignments are not described properly, which leads to a lot of confusion. Otherwise, there is plenty of new material to learn, such as data leakage and feature generation, which other courses usually do not focus on.