Blog Post explaining attention

Hi Everyone,
I was not particularly happy with how attention is explained in most blog posts, so I’ve taken a crack at it here. I’ve tried to follow the fastai approach: strip out everything that isn’t needed and let the code do the talking.

I think it’s important to understand how concepts have evolved over time, so I’ve gone back to the paper that first introduced attention and explained the idea in the context of that paper.