RNN input shape

I’m having a hard time understanding what shape the input should be.
In particular, in class 6 Jeremy reshapes the input:
x_rnn=np.stack(xs, axis=1)
y_rnn=np.expand_dims(np.stack(ys, axis=1), -1)
x_rnn.shape, y_rnn.shape
((75110, 8), (75110, 8, 1))

I cannot get my head around the shape of y_rnn, and I can’t find any explanation in the documentation.
Why do we need to add another dimension of size 1 that doesn’t seem to do anything? All we really want is to get 8 characters, so why do they need another dimension?
Also, and this may be related: why does Jeremy stop using “Flatten” after the embedding layer?

Thank you.

Arrays with shape (8,) and (8, 1) hold the same values; only the number of axes differs.
But they are about to be fed into an RNN, and a Keras RNN layer works on 3-dimensional tensors: (batch, time steps, features).
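
To see the shapes concretely, here is a minimal sketch with toy data. The sizes and the random integer arrays are assumptions standing in for the lesson's `xs` and `ys`; only the `np.stack` and `np.expand_dims` calls come from the question above.

```python
import numpy as np

# Hypothetical toy stand-ins for the lesson's data:
# 8 arrays of character indices, one array per time step.
n_samples, seq_len, vocab_size = 5, 8, 10
rng = np.random.default_rng(0)
xs = [rng.integers(0, vocab_size, size=n_samples) for _ in range(seq_len)]
ys = [rng.integers(0, vocab_size, size=n_samples) for _ in range(seq_len)]

x_rnn = np.stack(xs, axis=1)         # (n_samples, seq_len)
y_flat = np.stack(ys, axis=1)        # (n_samples, seq_len)
y_rnn = np.expand_dims(y_flat, -1)   # (n_samples, seq_len, 1)

print(x_rnn.shape, y_flat.shape, y_rnn.shape)

# The extra axis changes the shape only, not the values.
same_values = np.array_equal(y_flat, y_rnn[..., 0])
print(same_values)
```

The trailing axis of size 1 gives each time step its own slot for a single label, so the targets have the same number of axes as a per-time-step prediction tensor.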

If you recall the RNN diagram in the lecture, the RNN takes 8 inputs, one character per time step, and at each step it combines the current character with a hidden state that summarises the previous character(s). In a standard RNN the same weights are reused at every step. This is different from a single fully connected layer, which sees all 8 inputs at once as one flat vector.
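
As a rough illustration of the time-step picture, the following NumPy sketch assumes a plain RNN that reuses a single set of weights at every step, which is the usual formulation. All sizes and weight names (`emb`, `W_xh`, `W_hh`, `W_hy`) are hypothetical and not taken from the lesson notebook; this is not the Keras implementation, just a way to trace the shapes.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sizes, chosen only to make the shapes easy to read.
batch, seq_len, vocab_size, n_fac, n_hidden = 4, 8, 10, 3, 6

# Integer character indices: a 2-D array, like x_rnn.
x = rng.integers(0, vocab_size, size=(batch, seq_len))

# The embedding lookup adds the feature axis:
# (batch, seq_len) -> (batch, seq_len, n_fac)
emb = rng.normal(size=(vocab_size, n_fac))
x_emb = emb[x]

# One set of weights, reused at every time step (assumption: plain RNN).
W_xh = rng.normal(size=(n_fac, n_hidden)) * 0.1
W_hh = rng.normal(size=(n_hidden, n_hidden)) * 0.1
W_hy = rng.normal(size=(n_hidden, vocab_size)) * 0.1

h = np.zeros((batch, n_hidden))
outputs = []
for t in range(seq_len):
    # Combine the current character with the running hidden state.
    h = np.tanh(x_emb[:, t] @ W_xh + h @ W_hh)
    # Emit one score vector over the vocabulary per time step.
    outputs.append(h @ W_hy)

y_pred = np.stack(outputs, axis=1)   # (batch, seq_len, vocab_size)
print(x.shape, x_emb.shape, y_pred.shape)
```

In this sketch the 2-D integer input becomes 3-D only after the embedding lookup, and the recurrent loop needs that time axis intact, so flattening the embedding output would discard the per-step structure. The prediction carries one vector per time step, so integer targets of shape (batch, seq_len, 1) have the same number of axes as the prediction.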

But the (75110, 8) shape is for the input, and (75110, 8, 1) is the output. Why would they be different? I get that the RNN expects (time steps, input_dim), but we’re actually still giving it only two dimensions, and it outputs three. Why?