I’ve been playing around with LSTMs for sequence classification and keep seeing example LSTMs with no explicit timesteps provided.
From this example, the following simple model is created:
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(max_features, 128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))
I’m curious how this LSTM is fed its input. Does it receive one vector of length 128 at each timestep, or a matrix with dimensions maxlen x 128?
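One way I tried to reason about this: whatever the Embedding layer outputs should be what the LSTM sees. A minimal sketch of that check, with max_features and maxlen values assumed purely for illustration:

from keras.models import Sequential
from keras.layers import Embedding

max_features, maxlen = 20000, 80  # assumed values, for illustration only

check = Sequential()
check.add(Embedding(max_features, 128, input_length=maxlen))
print(check.output_shape)  # (None, 80, 128): one maxlen x 128 matrix per sample

That suggests each sample reaches the LSTM as a full maxlen x 128 matrix, but I'm not sure that settles how the timesteps are handled internally.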
When a batch_input_shape parameter is passed in, the number of timesteps is explicit. When no timestep info is provided, I assumed one vector of length 128 is passed to the LSTM per timestep, as follows:
model = Sequential()
model.add(Embedding(max_features, 128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2, batch_input_shape=(batch_size, 1, 128)))
model.add(Dense(1, activation='sigmoid'))
However, I also trained a model as follows and got the same loss value after a few epochs.
model = Sequential()
model.add(Embedding(max_features, 128))
# note: batch_input_shape changed below
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2, batch_input_shape=(batch_size, maxlen, 128)))
model.add(Dense(1, activation='sigmoid'))
Maybe Keras is ignoring the batch_input_shape keyword argument? Interestingly, when I call model.summary() on each of the models above, they appear to have identical layers, with identical output shapes and parameter counts.
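For reference, here is a minimal sketch of the comparison I ran (the max_features, maxlen, and batch_size values are assumptions, for illustration only):

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

max_features, maxlen, batch_size = 20000, 80, 32  # assumed values, for illustration only

def build_model(**lstm_kwargs):
    # identical models except for the extra kwargs passed to the LSTM layer
    model = Sequential()
    model.add(Embedding(max_features, 128))
    model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2, **lstm_kwargs))
    model.add(Dense(1, activation='sigmoid'))
    return model

build_model().summary()
build_model(batch_input_shape=(batch_size, 1, 128)).summary()
build_model(batch_input_shape=(batch_size, maxlen, 128)).summary()
# all three summaries appear identical: same layers, output shapes, and param counts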
Any help would be much appreciated!