Lesson 6: Some questions about the stateful RNN

I have some questions about the stateful RNN as presented in Lesson 6.

bs=64   # What is this used for?

        # Why do we specify the batch size in the input shape? Does this mean
        # that the LSTM will automatically reset itself after that many batches?
        Embedding(vocab_size, n_fac, input_length=cs, batch_input_shape=(bs,8)),
        LSTM(n_hidden, return_sequences=True, stateful=True),
        TimeDistributed(Dense(vocab_size, activation='softmax')),

Also, what’s the point of supplying 64 batches of size 8 if the hidden state doesn’t reset after every batch? Wouldn’t it be equivalent to supplying 128 batches of size 4?
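
To make the shapes concrete, here is a minimal sketch in plain Python (no Keras involved). The toy data `x_toy` and the helper `iter_batches` are my own hypothetical names, not from the lesson notebook. It assumes that batch_input_shape=(bs,8) means each batch holds bs sequences of 8 time steps, and that with stateful=True the final state of row i in one batch is reused as the initial state of row i in the next batch, which is how I read the Keras documentation.

```python
bs = 64   # sequences per batch
cs = 8    # time steps per sequence

# Toy stand-in for x_rnn: 256 sequences of cs integer tokens each (hypothetical data).
n_seqs = 256
x_toy = [[seq * cs + t for t in range(cs)] for seq in range(n_seqs)]


def iter_batches(data, batch_size):
    """Yield consecutive, unshuffled batches, in the order shuffle=False would give."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


batches = list(iter_batches(x_toy, bs))

# Every batch is bs rows by cs columns, i.e. shape (64, 8).
batch_shapes = [(len(b), len(b[0])) for b in batches]
print(batch_shapes)

# Under the stateful assumption above, row 0 carries its state across batches,
# so it effectively sees these sequence indices one after another.
row0_sequence_ids = [b[0][0] // cs for b in batches]
print(row0_sequence_ids)
```

If that reading is right, changing the split between batch size and sequence length changes which sequences are chained together through the carried-over state, so the two layouts would not be interchangeable.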

# Why are we dividing len(x_rnn) into (bs*bs) blocks? Wasn't it stated
# that the input length must be a multiple of the batch size? So, should
# we have used len(x_rnn) - len(x_rnn)%(8*bs) instead?
mx = len(x_rnn)//bs*bs
# Also, why are we taking only the first mx samples?
# And shouldn't the batch size be bs*8 in this case?
model.fit(x_rnn[:mx], y_rnn[:mx], batch_size=bs, nb_epoch=4, shuffle=False)
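
As a quick check on the arithmetic, here is a small sketch of how the expression for mx evaluates in Python. The length `n = 1000` is a made-up value standing in for len(x_rnn), and the variable names other than bs and mx are hypothetical. In Python, `//` and `*` have the same precedence and are applied left to right, so `n // bs * bs` is `(n // bs) * bs`, not `n // (bs * bs)`.

```python
bs = 64
n = 1000  # hypothetical stand-in for len(x_rnn)

# Left-to-right evaluation: (n // bs) * bs, the largest multiple of bs that is <= n.
mx = n // bs * bs
print(mx)

# Equivalent formulation using the remainder.
mx_via_remainder = n - n % bs
print(mx_via_remainder)

# What the expression would be if it really divided by bs*bs (it does not).
blocks_of_bs_squared = n // (bs * bs)
print(blocks_of_bs_squared)

# The alternative from the question: round down to a multiple of 8*bs instead.
mx_multiple_of_8bs = n - n % (8 * bs)
print(mx_multiple_of_8bs)
```

So with these numbers, mx is simply n rounded down to a multiple of bs, and x_rnn[:mx] drops the few trailing samples that would not fill a complete batch.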

It would be really great if you could clarify any of the above points.

Thank you