Hi All,
I would like to check whether my understanding of positional embeddings (from the "Attention Is All You Need" paper) is correct. I have made an Excel mock-up of them. Can someone please take a look and correct it?
The setup looks as follows:
The Excel file is uploaded here.
I also don’t understand the role of the constant=500. Is it the LENGTH, or N_WORDS (basically vocab_size), or something else I am missing? Everywhere I have looked, it seems to be set to 10_000. I appreciate your help.
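For reference, here is a minimal sketch of the sinusoidal formula from the paper, PE(pos, 2i) = sin(pos / base^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / base^(2i/d_model)), with the constant exposed as a parameter. The names `positional_encoding`, `seq_len`, `d_model` and `base` are my own choices for illustration, and the sizes used are arbitrary, not taken from the spreadsheet.

```python
import math


def positional_encoding(seq_len, d_model, base=10_000.0):
    """Sinusoidal positional encoding following the paper's formula.

    seq_len, d_model and base are illustrative names (assumptions);
    base is the constant in question: 10_000 in the paper, 500 in the mock-up.
    """
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for k in range(d_model):
            i = k // 2  # index of the sin/cos pair this dimension belongs to
            angle = pos / (base ** (2 * i / d_model))
            # even dimensions use sin, odd dimensions use cos
            pe[pos][k] = math.sin(angle) if k % 2 == 0 else math.cos(angle)
    return pe


# Same toy sizes, two different constants (sizes chosen arbitrarily)
pe_paper = positional_encoding(seq_len=4, d_model=6)
pe_mock = positional_encoding(seq_len=4, d_model=6, base=500.0)
```

In this sketch, neither the sequence length nor the vocabulary size enters the angle: `seq_len` only decides how many rows are produced, and `base` only controls how quickly the wavelengths grow from one pair of dimensions to the next.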
Thanks,
Deepak