Thanks for your post!
One thing that helped me build intuition for backprop (and the chain rule) was thinking in terms of the computation graph.
The following videos from Andrew Ng's deep learning course give a simple, clear visual explanation:
- Intro on computation graph (3 min): https://www.youtube.com/watch?v=hCP1vGoCdYU
- Chain rule visualization with a computation graph (14 min): https://www.youtube.com/watch?v=nJyUyKN-XBQ
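
In case it helps, here is a minimal Python sketch of the computation-graph view on a tiny example. The function J = 3 * (a + b * c), the intermediate names `u` and `v`, and the helper `forward_backward` are all assumptions I picked for illustration. The forward pass evaluates the nodes left to right, and the backward pass applies the chain rule right to left, reusing the gradient already computed for the downstream node.

```python
def forward_backward(a, b, c):
    """Evaluate J = 3 * (a + b * c) and its gradients via a hand-built graph.

    Illustrative graph (hypothetical example): u = b * c, v = a + u, J = 3 * v.
    """
    # Forward pass: compute each node from its inputs.
    u = b * c
    v = a + u
    J = 3 * v

    # Backward pass: start at the output and walk the graph in reverse.
    dJ_dv = 3.0            # J = 3 * v
    dJ_du = dJ_dv * 1.0    # v = a + u, so dv/du = 1
    dJ_da = dJ_dv * 1.0    # v = a + u, so dv/da = 1
    dJ_db = dJ_du * c      # u = b * c, so du/db = c
    dJ_dc = dJ_du * b      # u = b * c, so du/dc = b
    return J, {"a": dJ_da, "b": dJ_db, "c": dJ_dc}


J, grads = forward_backward(5.0, 3.0, 2.0)
print(J, grads)
```

Each local derivative only needs the node's own inputs, which is why the gradient of a deep graph can be assembled one edge at a time.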