Most important lesson learned

I wanted to create this post as a warning to my fellow beginners. We’re almost 75% through the course, and somewhere in the middle I got sidetracked into digging into theory. I felt that the advanced discussions were theory related and that going through some extra material might help me understand things better.

However, despite the initial warning from @Jeremy during the first lecture and repeated discussions with @radek, I continued with my mistake by pursuing CS231N and CS224N (which felt like they weren’t completely theoretical) and by watching conferences and discussions related to the models we use. I did do the assignments for the uni courses, but the courses were still much more theory intensive.

I’ve now fully realised that a code-first approach is the best way of understanding models, and even the theoretical concepts themselves.

From now on, for every minute spent reading theory, I will try my best to spend 10 minutes coding. I hope it’s not too late and that I will still have enough time to polish my skills through constant Kaggle-ing and to improve during the period between Part1v2 and Part2v2.

Please do not make the same mistake that I did :slight_smile:
Better understanding will definitely come through better coding.

Sanyam Bhutani.


Nice, I can relate to this. I haven’t been sidetracked by watching other videos, but I spent 80% of my time last week on exploratory data analysis (EDA) and the rest on building neural nets. I’m not sure it was time wasted, as EDA is always needed, but I feel like I’m a little behind on last week’s video. Looking forward to today’s class!


Thanks for sharing your deeply moving story of discovery! :wink:

(For those who are interested in just a touch more theory, watch lessons 9 onwards in the ML class, which cover much of the same material but in a more bottom-up way. Also, the first 8 lessons include a lot of EDA discussion.)


It will not be time wasted but time invested.

Just think how much time Jeremy must have put in to make DL feel easy for us. Without a very deep understanding of the techniques, how would that even be possible?


In my case, I feel I missed out on a lot of ‘doing’ by sitting back and ‘understanding’ theoretical concepts. That might help me read a technical blog faster, but it definitely doesn’t help much with anything practical.

True. But overdoing theory didn’t turn out well. :sweat_smile:
So I’ll stick with applying the theory rather than simply reading mathy equations.

Interesting. I found that going through the theory (not the mathy stuff) helped me a lot in terms of understanding. Implementing basic stuff in pure NumPy helped a lot too. I guess it depends on your learning style.
Disclosure: the only thing I read was the Deep Learning book.


I missed out on the implementing part; that was my mistake.

Can’t agree more!

Here is my analogy:

Reading theory = looking at the map
Coding = actually picking up your bag, putting on your boots and walking the terrain

That is the difference I felt!!!


To be honest with you, the Fast AI methodology took me some time to get used to.

Since I have been taught in a theory-first manner all my student life, I assumed that was what had to be done here as well. So I’d set off reading Medium posts, followed by arXiv papers on architectures, and end up reading a theoretical, mathy book. The lectures (the first 4) let us play around right away with very little ‘theory’, and I felt it was my task to fill that ‘gap’.

I don’t hate math. But I don’t think studying math alone has ever helped me directly unless I applied it.

What I realised (and it took me time, sadly) is that the Fast AI approach, “Dive right in” and “Learn only as much as is required”, is the fastest way to AI :slight_smile: And I can learn without going into a lot of theory. (Realising this took me a stupid amount of time, since it’s not how I was taught at school/college.) Also, the bottleneck in my performance is not my understanding of math but my ability to code it.


@init_27 Just to understand: based on your experience, would you recommend coding the entire implementation yourself (e.g. the dogs and cats or dog breed notebook, without referring to Jeremy’s solution), then playing around with the values of different parameters (like batch size and image size) to understand how the model performs?


Yes, try coding it without looking at the notebook. Cheat once or twice if you have to, and try to replicate the same results using the same techniques.
It sounds easy, but when you first begin, it’s slightly difficult.
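
To make the "change one parameter and see what happens" part concrete, here is a minimal sketch. It is not taken from the course notebooks: it is a toy logistic regression in pure NumPy, and every name in it (`train_logreg`, `make_toy_data`, the dataset, the learning rate) is made up for illustration. In the real notebooks you would instead vary the batch size or image size you pass to the library and compare the results in the same way.

```python
import numpy as np

def make_toy_data(n=400, seed=0):
    """Hypothetical toy dataset: 2-D points labelled by which side of a line they fall on."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)
    return X, y

def train_logreg(X, y, batch_size, epochs=20, lr=0.1, seed=0):
    """Mini-batch SGD for logistic regression; returns the final training accuracy."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        order = rng.permutation(n)                       # reshuffle every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            p = 1.0 / (1.0 + np.exp(-(X[idx] @ w + b)))  # sigmoid predictions
            grad = p - y[idx]                            # d(loss)/d(logit) for log loss
            w -= lr * X[idx].T @ grad / len(idx)
            b -= lr * grad.mean()
    preds = (X @ w + b) > 0
    return float((preds == y).mean())

# Same data, same code, only the batch size changes.
X, y = make_toy_data()
results = {bs: train_logreg(X, y, batch_size=bs) for bs in (8, 32, 128)}
for bs, acc in results.items():
    print(f"batch_size={bs:>3}  train_accuracy={acc:.3f}")
```

The point is the habit rather than the model: keep everything fixed, change one value, and look at what moves.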


As one considers how to get the most out of this course, I think it’s worth noting that there are more avenues for learning this stuff than the binary of “code” versus “math equation theory”. There are also blog posts, and books with abstract but very useful explanations of underlying concepts, or with really contrived examples designed to prove a broader point. Personally, I found moving in a cycle between “abstract” and “practical” quite helpful. A bit like zooming in and out.
Specifically, I would watch Jeremy’s videos, then try to re-do the notebook myself. I wouldn’t always “get it” though, so then I would read other people’s blog posts or book chapters attempting to explain the same thing. I found this really useful, because different people will explain the same thing from slightly different angles, and it helps you see the bigger picture. I would also often google things like “intuitive explanation of {some concept}”. Usually stuff comes up. Those explanations aren’t “practical”, per se, but they’re super useful for putting your later coding experience into context.
I would then come back to the notebooks and/or watch Jeremy’s videos again, and would often “get” a lot more, and then be able to apply it or see the “theory” in action.

Even more specifically… I found the book Grokking Deep Learning to be a great companion to this course. You won’t learn any tips for boosting your model scores by 0.5%. But you’ll get loads of intuitive explanations and analogies for really grasping what NNs are doing and why they’re doing it, which is invaluable for applying the architectures to your specific problem. And you’ll build them from scratch with just Python and NumPy to help put things into more concrete focus. There’s no math. Just the written word and code samples.


I completely agree with your point. But I’d personally deviate to try to get an in-depth grasp of every mathy variable.

I wouldn’t emphasise Fast AI’s content as much. Rather, say after the first lecture, I’d wander off reading about the variations of gradient descent and trying to understand ResNet architectures.

Fast AI recommends putting in 10-20 hours a week, and I believe that time is best dedicated to the approach you’ve mentioned :slight_smile:

And yes, the book is a really good place to start, even if you’re not good at Python. It’s by far the best intuitive explanation of everything at such a basic level, imo.