I have some doubts regarding the stateful RNN as presented in Lesson 6.
```python
from keras.models import Sequential
from keras.layers import Embedding, BatchNormalization, LSTM, TimeDistributed, Dense

bs = 64  # What is the use of this?

model = Sequential([
    # Why do we specify the batch size in the input? Does this mean that the LSTM
    # will automatically reset itself after so many batches?
    Embedding(vocab_size, n_fac, input_length=cs, batch_input_shape=(bs, 8)),
    BatchNormalization(),
    LSTM(n_hidden, return_sequences=True, stateful=True),
    TimeDistributed(Dense(vocab_size, activation='softmax')),
])
```
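For reference, my current understanding (which may well be wrong) is that with `stateful=True` nothing resets automatically: the hidden state carries over from one batch to the next until we call `model.reset_states()` ourselves. A minimal sketch of the training loop I have in mind, where `X`, `y`, and `n_epochs` are just placeholders:

```python
# Sketch only: X, y, n_epochs are placeholders, and X's first dimension is
# assumed to be an exact multiple of bs (a requirement of stateful layers).
for epoch in range(n_epochs):
    model.fit(X, y, batch_size=bs, nb_epoch=1, shuffle=False)
    model.reset_states()  # the state is only cleared when we ask for it
```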
Also, what's the point of supplying batches of 64 sequences of length 8 if the hidden state doesn't reset after every batch? Wouldn't it be equivalent to supplying batches of 128 sequences of length 4?
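To make my confusion concrete, here is how I picture `fit()` walking the data when `shuffle=False` (just my mental model, with toy data standing in for `x_rnn`):

```python
import numpy as np

# Toy stand-in for x_rnn: 256 rows, each a "sequence" of 8 character ids.
X = np.arange(256 * 8).reshape(256, 8)
bs = 64

for b in range(len(X) // bs):
    batch = X[b * bs:(b + 1) * bs]  # what fit(..., shuffle=False) would feed
    # With stateful=True, hidden-state row i after this batch is the starting
    # state for row i of the next batch, so row i effectively reads
    # X[i], X[bs + i], X[2*bs + i], ... as one long stream.
```

If that picture is right, the batch size determines which rows get stitched together across batches, which is why I'm unsure whether 64×8 and 128×4 really are interchangeable.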
```python
# Why are we dividing len(x_rnn) into (bs*bs) blocks? Wasn't it written
# that we need an input that is a multiple of the batch size? So should we
# have done len(x_rnn) - len(x_rnn) % (8*bs)?
mx = len(x_rnn)//bs*bs

# Also, why are we taking only the first mx samples?
# And shouldn't batch_size be bs*8 in this case?
model.fit(x_rnn[:mx], y_rnn[:mx], batch_size=bs, nb_epoch=4, shuffle=False)
```
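While writing this I double-checked the precedence of that expression; if I read it right, `//` and `*` bind equally and associate left to right, so `mx` is `len(x_rnn)` rounded down to a multiple of `bs`, not a division by `bs*bs`:

```python
# Quick check of my reading of mx = len(x_rnn)//bs*bs (toy numbers):
n, bs = 1000, 64
assert n // bs * bs == (n // bs) * bs  # left-to-right, not n // (bs*bs)
assert n // bs * bs == n - n % bs      # i.e. round n down to a multiple of bs
print(n // bs * bs)                    # -> 960
```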
It would be really great if you could clarify any of the above points.