I’m trying to understand how the data preparation for a language model works. The get_batch function in LanguageModelLoader returns a sample and a label for each batch, and the shapes aren’t what I expected.
I know the purpose of the model is to predict the next word given the preceding sequence, so I was expecting a sample to be a sequence and the label to be the single word following that sequence, say data[0:50] and data[50]. However, it seems that the sample and the label have the same length, just shifted by one, so something like data[0:50] and data[1:51]. I can’t quite wrap my mind around how this is working.
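To make it concrete, here is a minimal sketch of the slicing I think is going on (a toy reconstruction, not fastai’s actual code; `get_batch_sketch`, `data`, and `seq_len` are my own names), assuming data is a 1-D array of token ids:

```python
import numpy as np

def get_batch_sketch(data, i, seq_len):
    # Don't run past the end of the data: y needs one extra token beyond x.
    seq_len = min(seq_len, len(data) - 1 - i)
    x = data[i : i + seq_len]          # input sequence
    y = data[i + 1 : i + 1 + seq_len]  # same sequence, shifted by one token
    return x, y

data = np.arange(10)  # stand-in token ids: 0..9
x, y = get_batch_sketch(data, 0, 5)
print(x)  # [0 1 2 3 4]
print(y)  # [1 2 3 4 5]
```

So y[t] is the token that follows x[0:t+1], which is why the label has the same length as the sample rather than being a single word.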