Hidden State Memory in RNN Predict Calls?

I’m curious whether anyone has insight into how a Keras RNN/LSTM maintains memory (hidden state) across calls to the model.predict(x) method.

Using char-rnn text generation as an example, each sequence (e.g. ‘I hate this place and’ vs ‘I love this place and’) is cut into multiple sub-sequences of length K, each of which is then converted to a KxD matrix (either one-hot encoded or embedded). Training and prediction follow the same protocol. During training, the sub-sequences are fed in sequentially, so the hidden states reflect prior sub-sequences.
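Here is a minimal sketch of the setup I mean (assumed char-rnn-style preprocessing, not any particular implementation; the vocabulary size D and the model layout are placeholders):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

K = 10   # sub-sequence length
D = 64   # vocabulary size (placeholder)

def to_subsequences(encoded_text, k=K):
    """Cut an integer-encoded text into consecutive length-k windows."""
    return [encoded_text[i:i + k] for i in range(0, len(encoded_text) - k + 1, k)]

def one_hot(subseq, d=D):
    """Convert a length-k list of character ids into a k x d one-hot matrix."""
    m = np.zeros((len(subseq), d), dtype="float32")
    m[np.arange(len(subseq)), subseq] = 1.0
    return m

# a simple next-character model over K x D one-hot inputs
model = keras.Sequential([
    keras.Input(shape=(K, D)),
    layers.LSTM(128),
    layers.Dense(D, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam")
```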

However, when calling predict(), it looks to me like the prediction only uses data/memory from the exact sub-sequence fed into it. In the two examples above, if K=10, the final sub-sequence is ‘ place and’. Feeding this into predict() gives the model no context about the ‘love’ vs ‘hate’ sentiment preceding it.
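To make the question concrete, this is the predict pattern I’m describing (char_to_id and id_to_char are assumed lookup dicts from the preprocessing step, not anything specific):

```python
subseq = " place and"  # last K=10 characters of either sentence
x = one_hot([char_to_id[c] for c in subseq])    # char_to_id: assumed char -> id dict
probs = model.predict(x[np.newaxis, ...])       # input shape (1, K, D), output (1, D)
next_char = id_to_char[int(np.argmax(probs))]   # id_to_char: assumed id -> char dict
# nothing in x tells the model whether the sentence started with 'love' or 'hate'
```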

Does anyone have insight into how this is supposed to work, and how to get longer-term memory into predictions?
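For reference, the kind of behaviour I’m after looks something like the sketch below, using Keras’s stateful=True option, where the layer keeps its hidden state across predict() calls until it is explicitly reset. I’m not sure whether this is the intended mechanism, though:

```python
# stateful layers need a fixed batch size, here 1
inputs = keras.Input(batch_size=1, shape=(K, D))
h = layers.LSTM(128, stateful=True)(inputs)
outputs = layers.Dense(D, activation="softmax")(h)
stateful_model = keras.Model(inputs, outputs)

# feed earlier sub-sequences first so their state carries into the final call
text = "I hate this place and"
for chunk in to_subsequences([char_to_id[c] for c in text]):   # char_to_id: assumed dict
    probs = stateful_model.predict(one_hot(chunk)[np.newaxis, ...])

stateful_model.reset_states()  # clear the carried state before the next sentence
```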