I have some doubts regarding the stateful RNN presented in Lesson 6.
bs = 64  # What is the use of this?
model = Sequential([
    # Why do we specify the batch size in the input? Does this mean the LSTM
    # will automatically reset itself after that many batches?
    Embedding(vocab_size, n_fac, input_length=cs, batch_input_shape=(bs, 8)),
    BatchNormalization(),
    LSTM(n_hidden, return_sequences=True, stateful=True),
    TimeDistributed(Dense(vocab_size, activation='softmax')),
])
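My current guess at why the batch size has to be baked into the model: with stateful=True the layer carries a hidden-state array of shape (batch_size, n_hidden) from one batch to the next, so that shape can't vary. Here is a toy sketch of that idea in plain NumPy (small made-up sizes standing in for 64 and n_hidden; this is my own illustration, not real Keras internals):

```python
import numpy as np

# Toy stand-ins for the real sizes (bs=64, n_hidden=256 in the lesson)
bs, n_hidden = 4, 3
state = np.zeros((bs, n_hidden))   # one row of hidden state per sequence stream

def toy_stateful_step(x_batch, state):
    # stand-in for an RNN step: the returned state persists across calls
    # until it is explicitly reset
    return np.tanh(x_batch + state)

x = np.ones((bs, n_hidden))
state = toy_stateful_step(x, state)
state = toy_stateful_step(x, state)   # second batch continues from prior state
print(state.shape)                    # (4, 3) — fixed by the batch size
```

Because the state array has one row per row of the batch, every batch must have exactly bs rows, which (as I understand it) is why batch_input_shape fixes bs up front.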
Also, what is the point of supplying batches of 64 sequences of length 8 if the hidden state doesn't reset after every batch? Wouldn't that be equivalent to supplying batches of 128 sequences of length 4?
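To make my confusion concrete, here is how I currently understand the stateful layout is supposed to work: the corpus is split into bs parallel streams, and row j of one batch is continued by row j of the next batch. A toy sketch with small made-up numbers (my own illustration, not the lesson's data-prep code):

```python
import numpy as np

bs, seqlen = 4, 8                      # toy stand-ins for 64 and 8
corpus = np.arange(bs * seqlen * 2)    # 64 fake "characters"

# Split the corpus into bs contiguous streams, then chop each stream into
# length-8 windows: row j of batch k+1 continues row j of batch k.
streams = corpus.reshape(bs, -1)                        # shape (4, 16)
batches = np.split(streams, streams.shape[1] // seqlen, axis=1)

print(batches[0][0])   # [0 1 2 3 4 5 6 7]
print(batches[1][0])   # [ 8  9 10 11 12 13 14 15]
```

If this picture is right, changing the batch size changes how many parallel streams the corpus is cut into, so 64×8 and 128×4 would not be interchangeable — but I'd appreciate confirmation.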
# Why are we truncating len(x_rnn) to len(x_rnn)//bs*bs? Wasn't it said
# that the input length needs to be a multiple of the batch size? So
# shouldn't we have done len(x_rnn) - len(x_rnn)%(8*bs)?
mx = len(x_rnn)//bs*bs
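For what it's worth, my reading of that line is that it simply rounds len(x_rnn) down to the nearest multiple of bs, not that it makes (bs*bs) blocks — a quick arithmetic check with a made-up corpus length:

```python
# n is a made-up corpus length, purely to check the arithmetic
n, bs = 75110, 64
mx = n // bs * bs      # floor-divide then multiply back: largest multiple of bs <= n
print(mx, n - mx)      # 75072 38

assert mx % bs == 0 and (n - mx) < bs
```

So the leftover n % bs samples (38 in this toy case) are the ones dropped — which is exactly why I'm asking whether the multiple should instead be 8*bs.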
# Also, why are we taking only the first mx samples?
# And shouldn't the batch size be bs*8 in this case?
model.fit(x_rnn[:mx], y_rnn[:mx], batch_size=bs, nb_epoch=4, shuffle=False)
It would be really great if you could clarify any of the above points.
Thank you!