Transformers are the most widespread architecture used for NLP tasks today, powering everything from text generation and summarization to translation and classification.
In this post, I explain how transformers work at a high level and without any mathematics, so you can gain a general idea of their inner workings.
You can read it here:
If you have any comments, questions, suggestions, feedback, criticisms, or corrections, please do let me know!