Recommended reading on Transformers?

I watched a very interesting interview with Ilya Sutskever by Lex Fridman, in which Sutskever talks about “transformers” (see this part around the 1:00:00 mark).

I see that fastai supports such models (e.g. the tutorial here), but do you have any recommended videos or papers for understanding what transformers are?

I recommend starting with the fastai NLP course (videos 17 and 18).

Then, I would go through these resources: