Get feature vectors from an RNN during evaluation

I’ve got the wikitext103 RNN model, which looks like this:
SequentialRNN(
  (0): RNN_Encoder(
    (encoder): Embedding(6408, 400, padding_idx=1)
    (encoder_with_dropout): EmbeddingDropout(
      (embed): Embedding(6408, 400, padding_idx=1)
    )
    (rnns): ModuleList(
      (0): WeightDrop(
        (module): LSTM(400, 1150)
      )
      (1): WeightDrop(
        (module): LSTM(1150, 1150)
      )
      (2): WeightDrop(
        (module): LSTM(1150, 400)
      )
    )
    (dropouti): LockedDropout()
    (dropouths): ModuleList(
      (0): LockedDropout()
      (1): LockedDropout()
      (2): LockedDropout()
    )
  )
  (1): LinearDecoder(
    (decoder): Linear(in_features=400, out_features=6408, bias=False)
    (dropout): LockedDropout()
  )
)

My goal is to feed in a sequence and get back a feature vector, e.g. to compute cosine similarity between sequences. Here’s code I found on this forum to accomplish this:

# m is a PyTorch model: m[0] is the RNN encoder, m[1] is the decoder.
# T converts the list of token ids to a tensor; V wraps it in a Variable with requires_grad=False.
h = m[0](V(T([tokensAsNumbers])))

feature_vec = to_numpy(h[0][2][0][2])

What’s the thinking behind h[0][2][0][2]? I can see how the 2s could correspond to the 2s in the repr string above, but not the 0s. What’s the logic here? Is there a better way to get feature vectors like this from PyTorch models?

A better way to do this is:

nodropout, yesdropout = m[0](V(T([tokensAsNumbers])))
lastLayerIx = 2
feature_vec = to_numpy(nodropout[lastLayerIx].data[0,-1,:]) # -1 is meant to select the last word in the sequence. Returns a length-400 vector.
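
To make the indexing concrete, here is a minimal sketch in plain PyTorch rather than fastai. It rebuilds only the embedding and the three LSTMs from the repr above (no dropout, untrained weights) and assumes the encoder hands back one tensor per layer in PyTorch's default time-first layout, (seq_len, batch, hidden). The encode helper and the token ids are made up for illustration; whether m[0] lays out its outputs the same way is an assumption worth checking against your fastai version.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Layer sizes copied from the repr above; the dropout wrappers are omitted.
embed = nn.Embedding(6408, 400, padding_idx=1)
rnns = nn.ModuleList([nn.LSTM(400, 1150), nn.LSTM(1150, 1150), nn.LSTM(1150, 400)])


def encode(tokens):
    """Hypothetical stand-in for the encoder: one output tensor per LSTM layer."""
    x = embed(tokens)              # (seq_len, batch, 400)
    per_layer = []
    for rnn in rnns:
        x, _ = rnn(x)              # (seq_len, batch, hidden_size)
        per_layer.append(x)
    return per_layer


token_ids = [5, 17, 42, 8, 99]     # hypothetical token ids

with torch.no_grad():
    # Column layout, shape (5, 1): one sequence of five tokens.
    column = torch.tensor(token_ids).view(-1, 1)
    outs = encode(column)
    last_layer = outs[-1]              # (5, 1, 400)
    feature_vec = last_layer[-1, 0]    # last time step of the only sequence

    # Row layout, shape (1, 5): read as five sequences of length one.
    row = torch.tensor([token_ids])
    outs_row = encode(row)             # outs_row[-1] has shape (1, 5, 400)

print(last_layer.shape, feature_vec.shape, outs_row[-1].shape)
```

Under that layout, the index right after the layer picks the time step and the next one picks the sequence within the batch, so it is worth printing .shape on the real encoder output before settling on indices.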

If I read your code correctly, it will get you the hidden state for the first input word (data[0, ..]) instead of the last one.

Here’s a complete example that gets you the hidden state for the last word:

model: SequentialRNN = <Lesson 4 or 10 model>
encoder: RNN_Encoder = model[0]

# Reshape to (seq_len, 1): the whole sequence in one column, i.e. a batch of one
ary = np.reshape(np.array(tokensAsNumbers), (-1, 1))
hidden_states, outputs_with_dropout = encoder(V(ary))
hidden_states_last_layer = hidden_states[-1]  # outputs of the last LSTM layer
hidden_state_last_word = hidden_states_last_layer[-1].squeeze()  # last time step
feature_vec = to_np(hidden_state_last_word.data)
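
Once you have a feature vector per sequence, the cosine similarity mentioned in the original question could be computed along these lines. This is a small standard-library sketch, not something from fastai: the cosine_similarity helper and the short example vectors are made up for illustration and stand in for two 400-dimensional feature vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional stand-ins for real feature vectors
vec_a = [1.0, 2.0, 0.0, -1.0]
vec_b = [2.0, 4.0, 0.0, -2.0]   # same direction as vec_a
vec_c = [2.0, -1.0, 3.0, 0.0]   # orthogonal to vec_a

sim_ab = cosine_similarity(vec_a, vec_b)   # close to 1: same direction
sim_ac = cosine_similarity(vec_a, vec_c)   # 0: no overlap in direction
print(sim_ab, sim_ac)
```

The helper only needs len() and iteration, so it should also accept the 1-D array that to_np returns.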

What is ‘m[0]’ here?

How would I do this to get output from the last hidden layer of a CNN before the output layer?