It seems a lot of the active Kaggle competitions that reward ranking points are for image processing, which requires extensive knowledge of deep learning. Since I am new to data science, could you please provide some tips on how to approach these types of competitions?
Hey @yyun2 that’s a great question!
The short answer would be: come along to the deep learning course this Monday evening from 6:30pm. Frankly, my view is that deep learning is well on the way to replacing all other types of machine learning, so that’s probably what you want to focus on long-term anyway! We’ll be covering some foundational components of deep learning in the MSAN program, but the certificate course will provide a lot more opportunities to become a very strong deep learning practitioner. Don’t worry if you’ve never done ML before - the deep learning certificate starts from scratch, so as long as you work hard during the course you’ll have an opportunity to learn.
In general, image data is of a kind I call “unstructured” - that is, there aren’t different columns with different types of information (like ‘revenue’, ‘zip code’, etc), but every column is of the same kind (e.g. pixel intensities). For this kind of data, k-nearest neighbors (which we’ll cover) is quite effective, and you can even pass the pixel values straight into a random forest and get reasonable results! Having said that, you’re not going to place very highly without deep learning, because deep learning nowadays is just so easy to use, and so effective for this kind of data.
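To make the “every column is a pixel intensity” idea concrete, here is a minimal sketch of k-nearest neighbors on raw pixel values, written in plain Python. Everything in it is an assumption made up for illustration: the 4x4 toy images with a bright vertical or horizontal bar, the noise level, and the helper names `make_image`, `make_dataset` and `knn_predict`. It is not how you would attack a real competition; there you would more likely flatten the real images the same way and hand them to a library implementation such as scikit-learn’s `KNeighborsClassifier` or `RandomForestClassifier`.

```python
import math
import random

SIZE = 4  # toy images are SIZE x SIZE pixels (an assumption for this sketch)


def make_image(kind, pos, rng):
    """Return a flattened toy image: a bright bar on a dark, noisy background."""
    pixels = []
    for row in range(SIZE):
        for col in range(SIZE):
            on_bar = (col == pos) if kind == "vertical" else (row == pos)
            noise = rng.uniform(0.0, 0.2)
            pixels.append(1.0 - noise if on_bar else noise)
    # Every "column" of this row of data is the same kind of thing:
    # a pixel intensity between 0 and 1.
    return pixels


def make_dataset(copies, rng):
    """Build (pixels, label) pairs covering every bar position `copies` times."""
    return [
        (make_image(kind, pos, rng), kind)
        for kind in ("vertical", "horizontal")
        for pos in range(SIZE)
        for _ in range(copies)
    ]


def knn_predict(train, pixels, k=3):
    """Label `pixels` by majority vote among its k nearest training images."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], pixels))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)


rng = random.Random(0)
train = make_dataset(copies=5, rng=rng)
test = make_dataset(copies=2, rng=rng)

correct = sum(knn_predict(train, pixels) == label for pixels, label in test)
accuracy = correct / len(test)
print(f"k-NN accuracy on toy images: {accuracy:.2f}")
```

The only “feature engineering” here is flattening each image into one long list of pixel values; the distance between two images is then just the Euclidean distance between those lists. That works on these toy bars because the classes are very well separated, which real photos usually are not.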
If you can’t come on Monday evenings, I can still give everyone in the MSAN program access to the videos, so you can watch them there, or go to http://course.fast.ai to watch last year’s version (which is now pretty out of date, so following this year’s is recommended).
Finally, don’t forget about the Kaggle kernels! You can use the kernels shared in the competition to prepare a submission relatively quickly and easily, even if you don’t fully understand what they’re doing. It’s still good practice for you to work on downloading and processing these different kinds of files, and you can try to gradually learn more and improve over time. You can also ask for help here on this forum, or on the Kaggle forums.