How does the computer know what the best values for the embedding matrix are? How does it calculate the loss?
I’ll try to answer.
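The weights in the embedding matrix are no different from the weights in any other layer. They get updated through backpropagation. The loss is computed on the network's output, not on the embedding itself, and its gradient flows back to the embedding rows that were looked up.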
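To make that concrete, here is a minimal sketch in plain Python, not taken from any course or library. The sizes, the learning rate, the token indices and the names `E`, `W`, `forward` and `sgd_step` are all made up for illustration. It looks up one row of a random embedding matrix, predicts a target token with a softmax, computes the cross-entropy loss and takes one gradient step.

```python
import math
import random

random.seed(0)

VOCAB = 5  # hypothetical vocabulary size
DIM = 3    # hypothetical embedding dimension
LR = 0.1   # hypothetical learning rate

# Embedding matrix E (VOCAB x DIM) and output weights W (DIM x VOCAB), both random.
E = [[random.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(VOCAB)]
W = [[random.uniform(-0.5, 0.5) for _ in range(VOCAB)] for _ in range(DIM)]


def forward(token, target):
    """Return (cross-entropy loss, softmax probabilities) for one pair."""
    h = E[token]  # embedding lookup: select one row of E
    logits = [sum(h[d] * W[d][v] for d in range(DIM)) for v in range(VOCAB)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[target]), probs


def sgd_step(token, target):
    """Take one gradient step; return the loss measured before the step."""
    loss, probs = forward(token, target)
    # Gradient of the loss w.r.t. the logits: probs - one_hot(target).
    dlogits = [p - (1.0 if v == target else 0.0) for v, p in enumerate(probs)]
    h = list(E[token])
    # Gradient w.r.t. the looked-up embedding row (uses W before it is updated).
    dh = [sum(W[d][v] * dlogits[v] for v in range(VOCAB)) for d in range(DIM)]
    for d in range(DIM):
        for v in range(VOCAB):
            W[d][v] -= LR * h[d] * dlogits[v]
        E[token][d] -= LR * dh[d]
    return loss


before = [row[:] for row in E]
loss_before = sgd_step(token=2, target=4)
loss_after, _ = forward(2, 4)
changed_rows = [i for i in range(VOCAB) if E[i] != before[i]]
print(loss_before, loss_after, changed_rows)
```

Only the row of `E` for the input token changes, because that is the only row that took part in the forward pass. The loss itself always comes from comparing the prediction with a label, so an embedding matrix needs something on top of it (here the softmax layer `W`) before any loss can be computed.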
It would be easier to answer your question if you gave us a bit more context. Are you currently watching the course?
You can look at word2vec for an example: the input is an embedding layer, the output is a softmax layer, and in the middle you have a bottleneck. For training, you choose pairs of words that appear close to each other in a sentence and use one as the input and the other as the label. You compute the cross-entropy on the softmax output, and backpropagating that loss moves the input word's vector in the embedding space in the direction that reduces the loss.
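A rough sketch of that training setup in plain Python is below. It is a toy version with a full softmax and is not the actual word2vec implementation. The corpus, window size, dimensions, learning rate and all names (`pairs`, `E`, `W`, `mean_loss`) are assumptions chosen for illustration.

```python
import math
import random

random.seed(0)

# Toy corpus (made up for illustration).
sentences = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]
vocab = sorted({w for s in sentences for w in s})
idx = {w: i for i, w in enumerate(vocab)}
V, DIM, WINDOW, LR, EPOCHS = len(vocab), 4, 1, 0.1, 200

# Training pairs: (input word, nearby word used as the label).
pairs = [
    (idx[s[i]], idx[s[j]])
    for s in sentences
    for i in range(len(s))
    for j in range(max(0, i - WINDOW), min(len(s), i + WINDOW + 1))
    if i != j
]

# Input embeddings (V x DIM) and softmax output weights (DIM x V).
E = [[random.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(V)]
W = [[random.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(DIM)]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_loss():
    """Average cross-entropy over all training pairs."""
    total = 0.0
    for x, y in pairs:
        probs = softmax([sum(E[x][d] * W[d][v] for d in range(DIM)) for v in range(V)])
        total -= math.log(probs[y])
    return total / len(pairs)


initial_loss = mean_loss()
for _ in range(EPOCHS):
    random.shuffle(pairs)
    for x, y in pairs:
        h = list(E[x])  # the bottleneck: the input word's embedding
        probs = softmax([sum(h[d] * W[d][v] for d in range(DIM)) for v in range(V)])
        dlogits = [p - (1.0 if v == y else 0.0) for v, p in enumerate(probs)]
        dh = [sum(W[d][v] * dlogits[v] for v in range(V)) for d in range(DIM)]
        for d in range(DIM):
            for v in range(V):
                W[d][v] -= LR * h[d] * dlogits[v]
            E[x][d] -= LR * dh[d]
final_loss = mean_loss()
print(f"mean loss: {initial_loss:.3f} -> {final_loss:.3f}")
```

After training, the rows of `E` are the word vectors you keep, and `W` is usually thrown away. Real implementations tend to avoid the full softmax for large vocabularies, but the idea of the loss is the same.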
How do we calculate the loss for embeddings? For example, say I only want to create embeddings, with no other layer. The input is a letter, then there is a matrix of random numbers, and then it guesses which letter the output should be. The next step is to calculate the loss. How does it do that?