What I will focus on to succeed in this course

Really liked your resolve @radek! :slight_smile: And super thoughtful to add the ‘what not to do’ section :+1:

All the best!

5 Likes

Hope you achieve all this @radek. All the best!

1 Like

I liked this part, and it made me wonder: maybe we can create a fastai group to participate in various Kaggle competitions (not limited to image recognition)? Who is in?

7 Likes

I would definitely love to be part of that.
Although I’m merely a beginner compared to all the pros around me :slight_smile:

2 Likes

Please be careful not to share any code / data related to a Kaggle competition privately outside of an official Kaggle team. Any information shared in such a manner should also be shared back with the Kaggle community. Doing otherwise would break Kaggle rules and would also be poor sportsmanship.

I am fully aware no foul play was intended here, but if we were to form a fastai group and it were not an official Kaggle team, we would effectively have an unfair advantage over everyone else participating in the competition. This also means that if some of us form an official Kaggle team, they should not use a forum thread accessible to others to communicate / share code / discuss ideas.

I don’t mean to sound harsh - I think your instinct to collaborate is awesome and is what learning is about; hence we have these forums :slight_smile: Just wanted to chime in to make sure we stay on the right side of the law :wink:

One thing we could do is start a thread specifically for forming teams from within our cohort. Everyone interested could write a couple of words on what competition they would like to collaborate on, how they would like to communicate, and what their background is. Once people agree to form a team, they can head over to Kaggle, make it official, and figure out which channels to use. If anyone finds this idea interesting, please feel free to start the thread :slight_smile:

Also, I think there would be a lot of value in taking what we learn in this course and applying it to different datasets. Maybe we could have a challenge thread where we try applying material from lecture X to this or that dataset and discuss issues / approaches / results? Happy to facilitate such threads if there is interest.

BTW, if someone is interested in Kaggle competitions, I created a thread some time ago on the Dogs vs Cats competition that doesn’t seem to want to die :slight_smile: It is here:

I guess I was amazed how well I and other people from the course did with just material from the first couple of lectures :slight_smile: The stuff we will learn here is really powerful.

Anyhow, I have been gone from DL for too long, so I am looking forward to getting my feet wet again :slight_smile: And maybe once a competition finishes, we could have another thread like the above to discuss our approaches and share experiences - but I guess we need to wait till one ends.

BTW, in reality I am a newb at all this, and certainly at participating in Kaggle competitions, so I hope I did not get much wrong with regard to the rules - please take what I write with a grain of salt :wink: Any definitive info on what is and isn’t allowed should probably come from @jeremy or directly from the Kaggle rules.

This course is happening and we are here!!! Oh man this will be amazing!

5 Likes

Haha. Nice trick to challenge yourself @radek. I wish you all the best for the next 7 weeks, it’s surely going to be an awesome journey.

Your post reminds me of another post by Brendan, a fast.ai alum (Lesson 11 wiki). Soon, there will be a flurry of material and as we keep learning new things every week, we will also realise there’s a lot more to try / learn in the process. It’s going to be exciting!

To keep my motivation up when the going gets tough, I saved this answer - StackGANs Video Project? from Jeremy when a student asked him “how can we possibly build something new in AI and compete with companies like Google / Facebook etc., when they have so much training data and so many resources?” His response is gold! :slight_smile:

11 Likes

I am also a beginner, but I am highly interested in learning cool stuff like Exploratory Data Analysis by doing it…

Oh, the serendipity! I loved Jeremy’s response, and was just about to search for it :slight_smile:

1 Like

@radek doing this course from scratch (without using the fastai lib) will be much harder than just avoiding utils.py from last year’s part 1. utils.py was just a few little convenience functions, but fastai contains the result of quite a bit of new research. You might be better off updating your goal to something involving helping to document or improve the fastai lib, rather than avoid it entirely.

Rewriting bits of it from scratch will certainly help you learn the material - just want you to not be too disappointed if you can’t do all of it from scratch!

23 Likes

Several of @radek’s reads are available on Safari Books Online:

  1. The Pleasures of Counting by Korner
  2. Introduction to Probability by Blitzstein
3 Likes

Perfect!

This motivates me. Thank you!

1 Like

I’m late to the discussion, but for @radek, @sermakarevich, @jamesrequa et al. who are interested in Kaggle competitions as a means to explore and learn more, here’s a course you may consider after DL Part 1 V2, ML1 V1 and before DL Part 2 V2 :upside_down_face:

How to Win a Data Science Competition: Learn from Top Kagglers
https://www.coursera.org/learn/competitive-data-science

One of the instructors is “KazAnova”, current #2 GrandMaster.
https://www.kaggle.com/kazanova

The cost is monthly, no matter how many courses you take: after the 7-day free trial, you pay $49 USD per month.

8 Likes

You can also just audit the course :slight_smile: It will not give you a sticker at the end, but you get all the knowledge!

5 Likes

I do not see any light at the end of the tunnel with fastai, and you are teasing me with a new course :wink: I wonder how @radek is getting on with his plan :nerd_face:?

2 Likes

A secret (and now it’s out in the open :wink: ): apply for financial aid for these Coursera courses, and you will almost always get it.

2 Likes

You could audit it first, go through the lectures, and then start the free trial once you’re through with them, if you want to avoid the 15-day waiting period for financial aid acceptance.

Maybe a bit of an update would be in order :slight_smile:

To quote Mike Tyson:

Everyone has a plan until they get punched in the face.

meaning that our plans often don’t stand a chance when confronted with reality. And so has the situation been in my case.

Jeremy shares so much great information, and it is coming at such a pace that even if I were to spend all my waking hours only on this course, I wonder if I would be satisfied with my progress :slight_smile:

As is, there are so many things I would like to try out and look into that I feel I am doing the bare minimum and still am not caught up with recent lectures.

Someone needs to get into a time machine and slap some sense into Radek from a couple of weeks ago! How was that plan even remotely narrow? What definition of laser focus is that?! :smiley:

Let’s say my definition of laser focus evolved and it entails: much less time reading, much less time watching, much more time coding.

Attempting this might have been my biggest mistake to date in this course, or a touch of - deranged but still - genius. The jury is still out :slight_smile:

(This relates to the project I describe here - essentially implementing the training loop from scratch, building functionality for working with models, etc. I’m learning a lot of PyTorch / Python, which is good - maybe - but at the cost of doing actual deep learning.)
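
For anyone wondering what “implementing the training loop from scratch” can look like in its simplest form, here is an illustrative sketch - a hand-rolled gradient-descent loop for 1-D linear regression in plain Python. This is my own toy example under assumed names, not the actual project code:

```python
# A from-scratch training loop for 1-D linear regression (y = w*x + b),
# with the gradients of mean squared error computed by hand - no framework.

def train_linear(xs, ys, epochs=1000, lr=0.05):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # forward pass: predictions for the current parameters
        preds = [w * x + b for x in xs]
        # backward pass: analytic gradients of MSE w.r.t. w and b
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # update step
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Fit y = 2x + 1 from four points; w and b converge toward 2.0 and 1.0
w, b = train_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Every framework training loop - fastai’s included - is this same forward / backward / update cycle, just with autograd, batching, and a real optimizer in place of the hand-derived pieces.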

The first part is done - the box is up and running. The second part I’m not sure is worth pursuing any more, given how little time is left. Even if I figured out how to train on the 50 GB of data, I doubt it would get me very far in the competition. Too little, too late.

Failed on that one miserably numerous times.

I might be getting there :slight_smile:

In summary, I think I am learning a lot and am moving at close to the highest pace that is still reasonable. I only get to be me and not someone else - would I be better off if I were a better programmer or knew more Python / PyTorch to start with? Probably, but none of that is an option, and accepting that I have to learn things that might seem very simple is the right way to go.

Would I be better off if I hadn’t peeked behind the curtains and had used higher-level functionality instead of putting things together myself? Maybe - but if so, that is a lesson I still haven’t learned and need to. I have already come a long way since watching the first lecture with Jeremy, and it is probably hard to imagine what a hopeless theoretician I was back then (especially since going the theory route without practice is the easiest way to go astray - but what else was there to do, as no other resource I know of bridges the gap between not knowing and doing the way this course does).

As is, approaching this as a marathon is the best I can do. I am willing to go at it full speed for the next couple of months, as I honestly feel I am finally devoting my time to the things that matter and have a relatively okay-ish handle on learning. If doing so leads nowhere apart from me learning something I always wanted to learn - which I feel is by far the most likely outcome - then so be it.

18 Likes

You’ll be more than ready to tackle part 2 by then!

4 Likes

This is not even close to the top 10%, and as my data center AKA my parents’ house gave out (power outage till evening at least), this is likely going to be my final submission :slight_smile:

A simple voting ensemble I remembered reading about somewhere and cooked up quickly in the heat of battle: 4 models, the best being densenet201 achieving ~0.62 on the LB. I trained only the classifier part :slight_smile:
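
For anyone curious, a voting ensemble like this can be as simple as the sketch below - a hard majority vote over each model’s predicted labels. The function names and the hard-vote-over-labels approach are my assumptions for illustration, not the exact code used here:

```python
from collections import Counter

def vote(predictions):
    """Majority vote across models for one sample.

    Ties are broken in favor of the label first encountered in the
    input order (Counter preserves first-seen order for equal counts).
    """
    winner, _ = Counter(predictions).most_common(1)[0]
    return winner

def ensemble(per_model_preds):
    """per_model_preds: one list of class labels per model, all the same length."""
    return [vote(sample_preds) for sample_preds in zip(*per_model_preds)]

# Four models voting on three test samples
model_preds = [
    ["cat", "dog", "dog"],   # e.g. the densenet201 model's predictions
    ["cat", "cat", "dog"],
    ["dog", "dog", "dog"],
    ["cat", "dog", "cat"],
]
ensemble(model_preds)  # -> ["cat", "dog", "dog"]
```

A common alternative is soft voting - averaging each model’s predicted probabilities before taking the argmax - which usually squeezes out a bit more accuracy when the models output calibrated probabilities.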

Above all, despite the relatively poor results, taking part in this felt really good and I had a lot of fun :slight_smile:

Here are the main lessons I learned:

  • Always complete a full pass all the way to submission as early as possible. What I mean is that it is crucial to have a bird’s-eye view of everything you will have to deal with. You cannot assume things - this activity will likely uncover quite a few things you did not anticipate, things that might be harder (or impossible) to deal with effectively if you sink a lot of time into perfecting the earlier stages of the pipeline.
  • It’s all about IO with datasets that are large relative to your hardware. Just reading the 200 GB of data off my HDD takes, I believe, over half an hour! If I ever work on something this size again, I will put some serious thought into RAID 0, getting an SSD, more RAM, ways to preprocess the dataset, etc.
  • Frequent the Kaggle forums, especially for the competition you are taking part in! There are such high-quality posts there and a lot of good pointers on how to attack a problem!
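
As a sanity check on the half-hour figure above: at an HDD’s typical sustained sequential throughput of roughly 100 MB/s (an assumed ballpark, not a measured number), reading 200 GB does indeed take on the order of half an hour:

```python
def read_time_minutes(size_gb, throughput_mb_s=100):
    """Back-of-envelope sequential read time for a dataset on disk.

    Assumes 1 GB = 1000 MB and a sustained throughput in MB/s.
    """
    return size_gb * 1000 / throughput_mb_s / 60

read_time_minutes(200)       # ~33 minutes at 100 MB/s
read_time_minutes(200, 500)  # an SSD-class drive cuts this to a few minutes
```

The same arithmetic makes it easy to see why RAID 0 or an SSD (both multiply the effective throughput) attacks this bottleneck directly, while more RAM only helps on repeated reads.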

All in all, I find that working on this was time well spent :slight_smile: But I feel it is important not to lose momentum. I would love to participate in the icebergs or the Favorita competition, but the wise choice is probably revisiting collaborative filtering, coding up those RNNs, and rewatching lectures :slight_smile: So this is the direction I will try to point myself in :wink:

12 Likes