What is the correct method to get text features from an LSTM or RNN network?

invisprints · November 20, 2020, 9:26am

I am a novice in NLP. Recently I needed to extract features from a sentence. I noticed there are different ways to get these features from an LSTM or RNN:

1. Get the last output from the network.
2. Get the last hidden state from the network.
3. Get the output at the last time step before the padding starts.

Since sentences have variable length, these features differ from each other. Methods 1 and 2 may include padded input. I wonder which one is best for extracting a sentence's features.
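To make the comparison concrete, here is a minimal PyTorch sketch of the three options. It assumes a single-layer, unidirectional `nn.LSTM` with `batch_first=True`; the sizes, the zero-vector padding, and all variable names are invented for the example. It also shows `pack_padded_sequence`, which is one common way to make the final hidden state ignore padding.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

torch.manual_seed(0)

# Hypothetical sizes: 3 sentences padded to 5 steps, 8-dim embeddings.
batch, max_len, emb_dim, hid_dim = 3, 5, 8, 16
lengths = torch.tensor([5, 3, 2])            # true lengths before padding
x = torch.randn(batch, max_len, emb_dim)
for i, n in enumerate(lengths.tolist()):
    x[i, n:] = 0.0                           # zero vectors stand in for padding

lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)

# Options 1 and 2 on the padded batch: the LSTM also steps over the padding.
out, (h_n, c_n) = lstm(x)
last_output = out[:, -1]                     # option 1: output at the final time step
last_hidden = h_n[-1]                        # option 2: final hidden state of the last layer

# Option 3: pick the output at the last real token of each sentence.
last_valid = out[torch.arange(batch), lengths - 1]

# Packing passes the true lengths to the LSTM, so its final hidden state
# is taken at the last real token instead of the last padded step.
packed = pack_padded_sequence(x, lengths, batch_first=True, enforce_sorted=False)
_, (h_packed, _) = lstm(packed)
features = h_packed[-1]                      # shape: (batch, hid_dim)
```

Under these assumptions, options 1 and 2 give the same tensor, and both are influenced by padding for the shorter sentences. Option 3 and the packed version agree with each other and depend only on the real tokens. With a bidirectional or multi-layer network the indexing would need to change.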

chudichudichudi · November 20, 2020, 3:28pm

In a seq2seq model like AWD-LSTM, you have two parts: the encoder and the decoder.

It usually looks something like this: Encoder -> hidden state, which goes to the Decoder as h0 -> Decoder

You usually want the last hidden state of the encoder, as it holds the knowledge of the sequence that you fed to the model.
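As a rough sketch of that flow, the snippet below wires two `nn.GRU` modules together in PyTorch. The choice of GRU, the sizes, and the random inputs are assumptions made for illustration; a real model would add embeddings, an output layer, and handling for padding.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, for illustration only.
emb_dim, hid_dim = 8, 16
encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)

src = torch.randn(2, 6, emb_dim)     # 2 source sequences, 6 steps each
tgt = torch.randn(2, 4, emb_dim)     # 2 target sequences, 4 steps each

# The encoder's final hidden state summarizes the whole source sequence.
_, enc_hidden = encoder(src)         # shape: (num_layers, batch, hid_dim)

# It is handed to the decoder as its initial hidden state h0.
dec_out, _ = decoder(tgt, enc_hidden)

# The same tensor can serve as a fixed-size feature vector per sentence.
sentence_features = enc_hidden[-1]   # shape: (batch, hid_dim)
```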

invisprints · November 21, 2020, 6:02am

Thank you! Let me check my understanding with an example. In the following picture, we often use the No. 4 tensor as our encoder features. Am I right?

chudichudichudi · November 21, 2020, 5:23pm

Yes, exactly.

It doesn’t have to be an LSTM. It could be an autoencoder or any architecture of your choice.

invisprints · November 22, 2020, 12:29am

Thanks a lot!