Likelihood of actually winning a Kaggle competition?

So I was just reading an interview with a team that won a Kaggle competition not too long ago, and one of them said they ran a custom network (with like 50 layers, mind you) on 55 cores for 4 weeks on their university's superpowered GPU system (I am paraphrasing the last bit). They also used some special custom libraries and techniques, including stacking.

Now, I know every competition is different, but how in the world are people who are learning ML as newbs, running Keras on a single cloud instance, supposed to compete with that?

So far most of my deep learning education has been directed towards figuring out a quality hardware solution for running models, since it seems like having a great GPU with a multithreaded system is pretty much the only way to go. I’ve yet to fully dive into the actual coding because you can’t do much when your basic CNN examples take 8 hours to run.

While everything you’ve stated is technically true, don’t be discouraged in the least.

It’s one thing for society to put restrictions on you; but it’s a completely different thing if you psychologically self-impose said restrictions.

I say this because data science is unlike any other field in the history of humanity. State-of-the-art techniques can become obsolete in as little as a month, or sometimes even a week. Going from a n00b to a practitioner is little more than continuing to practice and refine your art. Going from practitioner to master is all about investing that hard work, staying up to date on recent research (spend 30 min on arXiv every day), and trying to implement at least one new paper every week, even if it’s on a different dataset.

Even if you don’t become a Kaggle grand master following this regimen, you won’t have an issue landing an awesome job. Besides, there are other DS competition websites out there too.

Considering how many users this forum has, I think it might be interesting to start quarterly competitions here as well.

Thanks for your suggestion. I hope I can land an awesome job in the end.

There is also a reason why startups are historically more innovative than large companies. The result is not just about computing power or spending resources; it comes from creativity, innovation, intuition, and a fair touch of self-confidence. Some call this human intelligence. I am far from being an expert, but data science doesn’t look any different. In 2017, you can still implement great ideas with a 5-year-old Windows 7 machine and a GTX 1070 GPU.

Nice, thanks for the words of encouragement. I am a Texan, so I am sure I won’t be lacking for confidence when it comes down to it!

I’m just wondering how feasible it would be, and what the best and most cost-effective hardware setup for machine learning is that an average person can put together. When I started learning, I was using a $250 laptop equipped with an AMD graphics card, and it took the cats vs. dogs CNN like 10 hours to train, so obviously that wasn’t going to cut it. Now I just use a cloud instance and that same CNN takes 6 minutes, but I am wondering if there are any other simple setups that are better than mine, or if this is the best I can do for right now.

I’m not going to get hung up on trying to maximize hardware, but I would like to know if there is a “best” setup for the average Joe. Maybe I’ll make this into its own thread.

I personally lost too much time trying to win a Kaggle competition, of any kind.
The best I managed was 8th place in the Outbrain click prediction competition, about a week before the deadline.
Then I had to travel and focus on another task (a consultancy project) and didn’t have a chance to keep participating.
I ended up 83rd on the public leaderboard … quite frustrating, I would say.

I think the most ineffective way to learn data science is to try to win a Kaggle competition.
It is so unlikely to happen that you probably have similar chances of winning with a lottery ticket.
The average hourly rate per Kaggle participant, at the team level, is less than $2/hour.
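
As a rough back-of-the-envelope sketch of how a figure like that could be estimated: spread the prize pool over every hour invested by every participant. The function name and all of the numbers below are hypothetical, chosen only for illustration; they are not taken from any real competition.

```python
def expected_hourly_rate(prize_pool_usd, n_teams, team_size, hours_per_person):
    """Average payout per hour if the prize pool is spread over all hours invested."""
    total_hours = n_teams * team_size * hours_per_person
    return prize_pool_usd / total_hours


# Hypothetical inputs (assumptions for illustration, not real competition data).
rate = expected_hourly_rate(
    prize_pool_usd=25_000,
    n_teams=1_000,
    team_size=2,
    hours_per_person=40,
)
print(f"${rate:.2f}/hour")
```

With inputs in this range the average comes out well under $2/hour, and since only the top few teams are actually paid, most participants earn nothing at all.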

So if you ask me: just learn from the top results and focus your time and energy on real-life projects.
This way you will start accumulating a cool portfolio and gain a deep understanding of projects in production.

At least this worked rather well for me.