From my understanding of stateful RNNs where batch size > 1, the Keras documentation states:

> all batches have the same number of samples
>
> If X1 and X2 are successive batches of samples, then X2[i] is the follow-up sequence to X1[i], for every i.

This essentially means that the data must be “interleaved” across batches, as discussed here:

and also here: https://github.com/fchollet/keras/issues/1820

My understanding is that the training data has to be reconstructed to look like this:

Sequence: a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6

BATCH 0
  sequence 0 of batch: a b c d
  sequence 1 of batch: q r s t

BATCH 1
  sequence 0 of batch: e f g h
  sequence 1 of batch: u v w x
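To make the layout concrete, here is a minimal sketch (my own helper, not from the lesson notebook) of how the sequence above could be split into interleaved batches with NumPy. The function name `make_stateful_batches` and the parameters are my invention for illustration:

```python
import numpy as np

def make_stateful_batches(seq, batch_size, seq_len):
    """Split `seq` into `batch_size` parallel streams, then slice each
    stream into chunks of `seq_len`.  Row i of batch b+1 is then the
    follow-up of row i of batch b, which is what a stateful RNN expects."""
    seq = np.asarray(seq)
    stream_len = len(seq) // batch_size  # samples per parallel stream
    # row 0 holds the first half of the sequence, row 1 the second half, etc.
    streams = seq[:stream_len * batch_size].reshape(batch_size, stream_len)
    # each batch is a seq_len-wide slice taken across all streams at once
    return [streams[:, t:t + seq_len] for t in range(0, stream_len, seq_len)]

chars = list("abcdefghijklmnopqrstuvwxyz123456")
batches = make_stateful_batches(chars, batch_size=2, seq_len=4)
# batches[0] rows: ['a','b','c','d'] and ['q','r','s','t']
# batches[1] rows: ['e','f','g','h'] and ['u','v','w','x']
```

Running this on the 32-character sequence reproduces BATCH 0 and BATCH 1 exactly as written out above.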

However, it doesn’t look like @jeremy is doing this in his Lesson 6 notebook for the stateful LSTM (where the batch size = 64). Did Jeremy make a mistake, or do I have a misunderstanding?