Seq2seq multiple input features (Passing multiple word/word tokens as input)

siv · July 23, 2017, 1:43pm

Is there a way to pass extra feature tokens along with the existing word token (the training features/source file vocabulary) and feed them to the encoder RNN of a seq2seq model? Currently it accepts only one word token from the sentence at a time.

Let me put this more concretely. Consider machine translation (NMT): say I have two more feature columns alongside the source vocabulary (Feature1 here). For example:

+----------+----------+----------+
| Feature1 | Feature2 | Feature3 |
+----------+----------+----------+
| word1    | x        | a        |
| word2    | y        | b        |
| word3    | y        | c        |
+----------+----------+----------+
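
To make the table concrete, here is a minimal sketch of how such rows could be turned into parallel index sequences, one per feature column. The helper `build_vocab` and the per-column vocabulary layout are illustrative assumptions, not part of any seq2seq library.

```python
# Rows from the table: (Feature1, Feature2, Feature3) for each time step.
rows = [
    ("word1", "x", "a"),
    ("word2", "y", "b"),
    ("word3", "y", "c"),
]

def build_vocab(symbols):
    """Map each distinct symbol to an integer index, in order of first appearance."""
    vocab = {}
    for symbol in symbols:
        if symbol not in vocab:
            vocab[symbol] = len(vocab)
    return vocab

# One vocabulary per feature column (hypothetical layout).
columns = list(zip(*rows))
vocabs = [build_vocab(column) for column in columns]

# One index sequence per feature; each has the same length as the sentence.
indexed = [[vocab[s] for s in column] for vocab, column in zip(vocabs, columns)]
print(indexed)  # [[0, 1, 2], [0, 1, 1], [0, 1, 2]]
```

Keeping a separate vocabulary per column means each feature can later get its own embedding table with its own size.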

To summarise: currently the seq2seq dataset is a parallel corpus with a one-to-one mapping between the source feature (vocabulary, i.e. Feature1 alone) and the target (label/vocabulary). I’m looking for a way to map more than one feature (i.e. Feature1, Feature2, Feature3) to the target (label/vocabulary).
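
One possible way to feed several features per time step to an encoder, sketched here under assumptions, is to give each feature its own embedding table and concatenate the embeddings before the RNN. The class name `MultiFeatureEncoder`, the vocabulary sizes, and the embedding dimensions below are hypothetical and are not taken from the tutorial; only standard PyTorch modules (`nn.Embedding`, `nn.GRU`, `torch.cat`) are used.

```python
import torch
import torch.nn as nn

class MultiFeatureEncoder(nn.Module):
    """Hypothetical encoder: one embedding per feature, concatenated per time step."""

    def __init__(self, vocab_sizes, embed_dims, hidden_size):
        super().__init__()
        # One embedding table per feature column (Feature1, Feature2, ...).
        self.embeddings = nn.ModuleList(
            [nn.Embedding(v, d) for v, d in zip(vocab_sizes, embed_dims)]
        )
        # The GRU input size is the sum of all feature embedding sizes.
        self.gru = nn.GRU(sum(embed_dims), hidden_size, batch_first=True)

    def forward(self, features):
        # features: LongTensor of shape (batch, seq_len, num_features)
        embedded = [emb(features[..., i]) for i, emb in enumerate(self.embeddings)]
        x = torch.cat(embedded, dim=-1)  # (batch, seq_len, sum(embed_dims))
        outputs, hidden = self.gru(x)
        return outputs, hidden

# Assumed indices for the three table rows; columns are Feature1..Feature3
# (word1=0, word2=1, word3=2; x=0, y=1; a=0, b=1, c=2).
features = torch.tensor([[[0, 0, 0], [1, 1, 1], [2, 1, 2]]])
encoder = MultiFeatureEncoder(vocab_sizes=[3, 2, 3], embed_dims=[16, 4, 4], hidden_size=32)
outputs, hidden = encoder(features)
print(outputs.shape, hidden.shape)
```

The decoder side would stay unchanged in this sketch, since only the encoder input is widened. Summing the embeddings instead of concatenating them is another option, but it requires all features to share one embedding size.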

Moreover, I believe this is glossed over in the seq2seq PyTorch tutorial (https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation.ipynb), as quoted below:

When using a single RNN, there is a one-to-one relationship between inputs and outputs. We would quickly run into problems with different sequence orders and lengths that are common during translation… With the seq2seq model, by encoding many inputs into one vector, and decoding from one vector into many outputs, we are freed from the constraints of sequence order and length. The encoded sequence is represented by a single vector, a single point in some N dimensional space of sequences. In an ideal case, this point can be considered the “meaning” of the sequence.

Furthermore, I tried this in TensorFlow, but after spending a lot of time debugging and making changes I got nowhere. I have heard from my colleagues that PyTorch has the flexibility to do this and would be worth checking out.

Also, if anyone has worked on something similar with seq2seq in TensorFlow, PyTorch, or Keras, please share your advice.

Please share your thoughts on how to achieve this. It would be great if anyone could explain how to implement it in practice. Thanks in advance.