Get hidden states from LSTM and Transformer encoders

Hi, I want to train an autoencoder based on an LSTM and a Transformer, and eventually I want to get the hidden states after the encoder part. There are lots of implementations on the web using an encoder-decoder structure for these models; however, in the fastai models the "encoder" refers to the embedding matrix, like here for the transformer and here for the awd-lstm. How do I get the hidden states from the encoders of these models?

Not too sure what you’re asking: in fastai the core models Transformer and AWD_LSTM return the hidden states, and even the final model returns `output, raw_hiddens, hiddens` to enable AR/TAR regularization.
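
For example, a minimal sketch against the fastai v1 text API (signatures and return values differ between versions, and the sizes here are purely illustrative) showing the AWD_LSTM core returning per-layer hidden states:

```python
# Minimal sketch, assuming the fastai v1 text API (names/returns vary by version).
import torch
from fastai.text import AWD_LSTM

vocab_sz, emb_sz, n_hid, n_layers = 1000, 400, 1152, 3   # illustrative sizes
encoder = AWD_LSTM(vocab_sz, emb_sz, n_hid, n_layers)

ids = torch.randint(0, vocab_sz, (8, 20))  # a batch of token ids: (bs, seq_len)
encoder.reset()                            # (re)initialize the recurrent state
raw_outputs, outputs = encoder(ids)        # lists of per-layer hidden states
last_hidden = outputs[-1]                  # last layer, after dropout: (bs, seq_len, emb_sz)
```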


As far as I understand, these models are language models, i.e. they return hidden states and compute outputs at each step. I’m looking for an LSTM and a Transformer with an encoder-decoder architecture, like the ones used for machine translation.

There are none in fastai. You will have to build them yourself or take them from other repos.
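
If you build it yourself, a plain PyTorch encoder-decoder can be sketched with `nn.Transformer` (this is not a fastai model, and the sizes below are illustrative); the encoder’s hidden states are available directly from its `encoder` submodule:

```python
# Sketch of an encoder-decoder Transformer in plain PyTorch (not fastai).
import torch
import torch.nn as nn

model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6)

src = torch.rand(10, 32, 512)   # (src_len, bs, d_model), sequence-first by default
memory = model.encoder(src)     # encoder hidden states: (src_len, bs, d_model)

tgt = torch.rand(9, 32, 512)    # (tgt_len, bs, d_model)
out = model.decoder(tgt, memory)  # decoder attends over the encoder states
```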

Thank you for the reply. One last question: is it even possible to train such a model with fastai? To train these models efficiently I would need to provide source masks and target masks with the batches, and I assume the fastai Learner only accepts two items (x, y) from a TensorDataset, not (x, y, src_mask, trg_mask).

x and y can be lists of tensors, so I don’t think there is a problem here.
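
For example (an illustrative sketch, not fastai’s own API; the dataset class and forward signature here are hypothetical), you can pack the masks into x so they ride along with the inputs and get unpacked into the model’s forward:

```python
# Sketch: if the Dataset returns x as a tuple, the batch arrives as a tuple of
# tensors and can be unpacked into model.forward. Names are illustrative.
from torch.utils.data import Dataset

class Seq2SeqDataset(Dataset):
    "Yields ((src, src_mask, trg_mask), trg) so the Learner sees x as a tuple."
    def __init__(self, src, trg, src_mask, trg_mask):
        self.src, self.trg = src, trg
        self.src_mask, self.trg_mask = src_mask, trg_mask
    def __len__(self): return len(self.src)
    def __getitem__(self, i):
        return (self.src[i], self.src_mask[i], self.trg_mask[i]), self.trg[i]

# The model's forward then takes the unpacked tuple:
# def forward(self, src, src_mask, trg_mask): ...
```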
