In the notebook here (https://github.com/fastai/fastbook/blob/master/12_nlp_dive.ipynb), there is an explanation of how the data is fed into the LSTM (m below is len(dset) // bs, the length of each of the bs sub-streams):
The first batch will be composed of the samples:
(0, m, 2*m, ..., (bs-1)*m)
then the second batch of the samples:
(1, m+1, 2*m+1, ..., (bs-1)*m+1)
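To check my understanding, here is a quick sketch of that indexing (the toy data and names are mine, not fastai's; I'm treating each number as a sample index):

```python
items = list(range(100))  # a toy stream of 100 sample indices (my example)
bs = 5                    # batch size
m = len(items) // bs      # length of each of the bs sub-streams

# Batch k takes position k from every sub-stream, so row i of batch k+1
# continues exactly where row i of batch k left off.
def batch(k):
    return [items[i * m + k] for i in range(bs)]

print(batch(0))  # [0, 20, 40, 60, 80]
print(batch(1))  # [1, 21, 41, 61, 81]
```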
Can someone please explain why the data is fed in like this? Why not like this:
The first batch will be composed of the samples:
(1, 2, 3, ..., bs)
then the second batch could be:
(101, 102, ..., bs + 100)
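For contrast, a sketch of the alternative I have in mind (the jump of 100 between batches is just an arbitrary number from my example):

```python
items = list(range(200))  # toy stream, long enough for two batches (my example)
bs = 5

# Each batch is a run of bs consecutive samples; successive batches
# jump ahead by some offset (here 100, purely for illustration).
def alt_batch(k, offset=100):
    start = 1 + k * offset
    return [items[start + i] for i in range(bs)]

print(alt_batch(0))  # [1, 2, 3, 4, 5]
print(alt_batch(1))  # [101, 102, 103, 104, 105]
```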