Understanding LanguageModelData learning

I’m new to NLP.
I’m having trouble understanding the code from lesson4-imdb ( https://www.youtube.com/watch?v=gbceqO8PpBg&feature=youtu.be&t=6740s )

As I understand it, the model should predict the next word in a sentence, right?
In md.trn_dl we have vectors representing words (64 sequences), but what is the Y (target) value for this model? Is it the whole tensor of size 4800? What does the training process look like? I don’t understand exactly what the X and Y variables are for model training.
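
To make the question concrete, here is a minimal sketch of how I assume the batches are built. This is not the fastai code: `batchify` and `get_batch` are hypothetical helpers, and the toy sizes (20 tokens, 4 columns, 3 time steps) are made up. The assumption is that Y is just X shifted by one token and then flattened. That would explain a target of size 4800 if, for example, a batch had 75 time steps across 64 columns (75 * 64 = 4800), but the 75 is my guess.

```python
def batchify(token_ids, bs):
    """Split one long token stream into `bs` parallel columns.

    Returns a list of rows; each row holds one time step across all columns.
    (Hypothetical helper, only for illustration.)
    """
    n_rows = len(token_ids) // bs
    cols = [token_ids[c * n_rows:(c + 1) * n_rows] for c in range(bs)]
    # Transpose: rows are time steps, columns are independent text streams.
    return [[cols[c][r] for c in range(bs)] for r in range(n_rows)]


def get_batch(data, i, seq_len):
    """Return (x, y) where y is x shifted forward by one time step, flattened."""
    # Leave room for the one-step shift at the end of the data.
    seq_len = min(seq_len, len(data) - 1 - i)
    x = data[i:i + seq_len]
    y_rows = data[i + 1:i + 1 + seq_len]
    # Flatten the targets into one long list of next-token ids.
    y = [tok for row in y_rows for tok in row]
    return x, y


token_ids = list(range(20))          # stand-in for the numericalized corpus
data = batchify(token_ids, bs=4)     # 5 time steps x 4 columns
x, y = get_batch(data, i=0, seq_len=3)

print(x)       # 3 rows of 4 token ids each
print(len(y))  # seq_len * bs targets
```

If this picture is right, X would have shape (time steps, 64) and Y would hold the next token for every position in X, which is why Y is a single flat tensor.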

What kind of neural network is used for this? Is it seq2seq using skip-gram?

Please help :slight_smile: