Just wanted to add that I’m about 3/4 of the way through The Book of Why, and I’ve gotten a lot out of it. The chapter that looks at statistical paradoxes (Monty Hall, Simpson’s paradox, etc.) through the lens of causality is fabulous; the book is worth buying for that chapter alone. Collider bias is one of the most interesting things I’ve learned about in a long time, and it’s so simple!
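To show how simple collider bias is, here is a minimal simulation sketch (my own toy setup, not code from the book). The variable names `talent`, `looks`, and `famous`, and the cutoff of 1.0, are made up for illustration: two independent traits both cause a third variable (the collider), and looking only at the cases where the collider is "on" makes the two traits appear negatively correlated.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, no external libraries."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, seed=0):
    """Return (correlation in everyone, correlation among the selected)."""
    rng = random.Random(seed)

    # Two causes drawn independently: by construction they are unrelated.
    talent = [rng.gauss(0, 1) for _ in range(n)]
    looks = [rng.gauss(0, 1) for _ in range(n)]

    # The collider: both causes feed into it (hypothetical threshold of 1.0).
    famous = [t + l > 1.0 for t, l in zip(talent, looks)]

    r_all = pearson(talent, looks)

    # Conditioning on the collider: keep only the "famous" cases.
    sel_talent = [t for t, f in zip(talent, famous) if f]
    sel_looks = [l for l, f in zip(looks, famous) if f]
    r_selected = pearson(sel_talent, sel_looks)

    return r_all, r_selected


if __name__ == "__main__":
    r_all, r_selected = simulate()
    print(f"correlation in the whole population: {r_all:.3f}")
    print(f"correlation among the selected only: {r_selected:.3f}")
```

In the full population the correlation sits near zero, while among the selected cases it comes out clearly negative, even though neither trait affects the other. The selection alone creates the association.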
I’ve also had fun with this course on causal diagrams: https://www.edx.org/course/causal-diagrams-draw-assumptions-harvardx-ph559x. The extra practice working with causal DAGs really complements the book.
About the controversy itself: I don’t really see it as being all that controversial (which seems to be what LeCun is saying too). Deep learning is obviously great, but Pearl’s point is that causality is fundamentally an extra-statistical concept: causality uses probability and statistics (and I’m putting deep learning under that umbrella), but it isn’t reducible to probability and statistics. I think that’s a really exhilarating idea, and not something I had ever thought about before! But believing that doesn’t mean you should turn off your GPU and stick to drawing causal DAGs on a whiteboard.