Max norm and Kullback–Leibler divergence aren't covered in linear algebra and probability theory classes

Hi guys, I took these math classes, but when I read the deep learning book by Goodfellow, some concepts, like the max norm and Kullback–Leibler divergence, aren't covered in those classes.
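(If I'm reading Goodfellow right, the max norm is just the L∞ norm, i.e. the largest absolute entry of a vector; here is a quick NumPy check of my understanding, with a made-up example vector:)

```python
import numpy as np

x = np.array([3.0, -7.0, 2.0])    # made-up example vector
max_norm = np.max(np.abs(x))      # max norm (L-infinity): largest |x_i|
assert max_norm == np.linalg.norm(x, ord=np.inf)
print(max_norm)                   # 7.0
```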

I want to derive deep learning equations from scratch and understand research papers.

Do I lack some prerequisites? What do I need to learn? Thanks for your help!



Each DL paper will generally use a different set of mathematical techniques, so the best approach is to learn what you need when you need it. Quite often a paper is written precisely because it uses a novel approach (i.e. some area of math not previously applied to that problem), so you can't expect to learn all the math up front such that every paper will be immediately familiar to you!

Goodfellow’s book does a pretty good job of explaining KL Divergence from scratch IMO - let us know if you find anything confusing about his explanation.
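If it helps while you work through that chapter, the discrete form he explains boils down to D_KL(P ‖ Q) = Σ_i P(i) log(P(i) / Q(i)); here's a minimal NumPy sketch of it (the two distributions below are just made-up toy values):

```python
import numpy as np

def kl_divergence(p, q):
    """Discrete KL divergence D_KL(P || Q) = sum_i p_i * log(p_i / q_i).

    Assumes p and q are probability distributions over the same support,
    with q_i > 0 wherever p_i > 0.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p_i = 0 contribute 0 by convention
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

# Toy example: two distributions over three outcomes
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(kl_divergence(p, q))  # small positive number; 0 only when p == q
```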

I'm not sure what you mean by "derive deep learning equations from scratch". In part 2 of the course we cover pretty much all the math used in the fastai lib, so maybe that's a good place for you to look?


Thank you. I will try to master the applied math part of Goodfellow’s book and then learn along the way.

I will look into part 2.

Hi
There is a little bit on this ( https://www.inference.org.uk/itprnn/book.pdf ) on page 34, section 2.6.
Regards Conwyn
