The math-shaped elephant in the room

Hey y’all

After doing deep learning for a few years I’m starting to hit a wall where I’m blocked by my lack of math knowledge.

When I started on this adventure I did a few super quick calculus & linear algebra tutorials, which have mosssstly carried me through DNNs, CNNs, RNNs etc etc, but I dug my heels in and refused to do a full semester’s MOOC on any particular topic.

Now I’m mostly interested in GANs, VAEs etc etc, and I find more often than not I’m way out of my depth, especially on the probability / information theory side of things, and especially when the models get more complicated than toy examples. NIPS was super fun but I wish I’d been able to get more out of it, without my eyes glazing over whenever I saw a long equation.

2018 has to be the year I get over this hurdle - I can’t put it off any longer.

Wondering if y’all could detail your ‘learning math as an adult’ journey — specific books, MOOCs etc (that you actually did, not just things that have been recommended to you).

Interested in efficiency as ever!

Have you learned much probability theory yet? Not quite sure if this is what you’re looking for, but probability could be a fun path to strengthening your math skills in a way that meshes well with machine learning.

Harvard’s Stats 110 class (https://projects.iq.harvard.edu/stat110/home) is very fun, and the mathematical prerequisites are fairly gentle. Just some high school algebra and a bit of calculus. You can gradually dial up your mathematical sophistication as you learn more and more probability.

Another idea would be linear algebra, of course. If your goal is to boost your math/paper-reading skills, maybe give the book Linear Algebra Done Right a look? I feel like it’s probably not a great book for practical applications (maybe try Fast.ai’s linear algebra MOOC for something more practical), but it is very elegant and it has almost no prerequisites. It could help you learn how to read and write proofs, which might help you read the math that shows up in papers.

Start with this. It will tell you where all these statistics come from. Linear Algebra Done Right is good.

If you’re in the camp of “I learned this at some point 100 years ago and can’t remember any of it”, I recommend both Khan Academy and Andrew Ng’s refreshers; both were fairly digestible, though Andrew Ng’s doesn’t cover statistics or probability.

I agree with @cqfd that ‘Just some high school algebra and a bit of calculus’ is what you need.
You should go deeper into mathematics only if you want to do research in the field.
But for practical Deep Learning applications, I think you don’t need more than this.
For GANs, VAEs etc., focus on the descriptions and on the different implementation steps, treating the model as a black box (already defined and verified by scientists, and available inside frameworks). Then try several examples and you will master the process.
Even though I have a university-level background in mathematics myself, I don’t think it is necessary to go deeper into maths if you are not interested in Deep Learning research.

In addition to what we’ve covered here, you could take a look at Agustinus Kristiadi’s blog, where you should find useful implementations of these to get you started: https://wiseodd.github.io/techblog/

I’ve benefited from Khan Academy’s Maths track. They have a lot of practice questions. My main criticism of them is that it’s often so segmented / broken down into the specific skill, you lose track of the big picture implementation. But I guess that’s what things like Fast.AI are for.

Thanks for the replies y’all - super helpful (especially the Stats 110 class - thanks @cqfd).

“But for practical Deep Learning applications, I think you don’t need more than this.
For GANs, VAEs etc., focus on the descriptions and on the different implementation steps, treating the model as a black box (already defined and verified by scientists, and available inside frameworks). Then try several examples and you will master the process.”

Right - I want to innovate on the model side now — I feel like that’s basically impossible with GANs without understanding information theory, game theory etc.

I don’t like these things being black boxes that I only understand by their fancy names (StarGAN! DiscoGAN!) - whilst I’m getting better at piecing together what’s happening, it’s certainly more difficult than the “high school algebra / bit of calculus”.

It might just be me but I find Kristiadi’s blog & implementations super unhelpful — I think he’s writing for a different audience than me, but his code goes totally over my head…

Ok @jongold,
I was thinking in terms of the architecture of each type of GAN: if you follow it closely, you will see the differences, for example why one is called c-GAN and another f-GAN.
Personally, I have no problem understanding both the theory and the practical parts, and I went deeper because I am interested in Cognitive Computing research. Research is what makes it possible to innovate or create new trends.
But from my experience, I thought that developers should be able to perform these tasks by following the architecture of each type, without dealing with deep mathematics (perhaps I will write an article about this later, specifying the differences).
So, I wish you the best on your learning path.

Essential Mathematics for Artificial Intelligence started today on edX, offered by Microsoft

@helena have you tried that course? (Or anyone else here tried it?) If so, how does it look?

i started but got bored right away as it started with like 1x+1=2 type equations - i bet it gets more complicated but i lost interest… there is a similar course (from Duke) on Coursera - if i recall it’s more lively but still… i think the intro chapters in the DL book are the best refresher…

Yup they’re great.

There is a small probability refresher in Andrew Ng’s course: http://cs229.stanford.edu/section/cs229-prob.pdf

As for the original question, I found the MIT OCW courses to be very helpful:

Single Variable Calculus: https://ocw.mit.edu/courses/mathematics/18-01sc-single-variable-calculus-fall-2010/

Multivariate Calculus: https://ocw.mit.edu/courses/mathematics/18-02sc-multivariable-calculus-fall-2010/

Linear algebra: https://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/

Probability and Statistics: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-041sc-probabilistic-systems-analysis-and-applied-probability-fall-2013/
and https://ocw.mit.edu/courses/mathematics/18-650-statistics-for-applications-fall-2016/

The problem with those OCW courses is that if you’re mainly interested in supporting your DL studies, only a tiny fraction is actually relevant. That’s why I like the Goodfellow book’s introductory chapters - they only cover stuff that you actually need for this topic.
