Get hidden states from LSTM and Transformer encoders

Hi, I want to train an autoencoder based on an LSTM and a Transformer, and eventually I want to get the hidden states after the encoder part. There are lots of implementations on the web using an encoder-decoder structure for these models; however, in the fastai models the "encoder" refers to the embedding matrix, like here for the transformer and here for the awd-lstm. How do I get the hidden states from the encoders of these models?

Not too sure what you’re asking: in fastai the core models Transformer and AWD_LSTM return the hidden states, and even the final model returns `output, raw_hiddens, hiddens` to enable AR/TAR regularization.
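
For example, a minimal sketch against the fastai v1 text API (signatures and return values differ between versions, and the sizes here are purely illustrative) showing the AWD_LSTM core returning per-layer hidden states:

```python
# Minimal sketch, assuming the fastai v1 text API (names/returns vary by version).
import torch
from fastai.text import AWD_LSTM

vocab_sz, emb_sz, n_hid, n_layers = 1000, 400, 1152, 3   # illustrative sizes
encoder = AWD_LSTM(vocab_sz, emb_sz, n_hid, n_layers)

ids = torch.randint(0, vocab_sz, (8, 20))  # a batch of token ids: (bs, seq_len)
encoder.reset()                            # (re)initialize the recurrent state
raw_outputs, outputs = encoder(ids)        # lists of per-layer hidden states
last_hidden = outputs[-1]                  # last layer, after dropout: (bs, seq_len, emb_sz)
```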


As far as I understand, these models are language models, i.e. they return hidden states and compute outputs at each step. I’m looking for an LSTM and a Transformer with an encoder-decoder architecture, like the ones used for machine translation.

There are none in fastai. You will have to build them yourself or take them from other repos.
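
If you build it yourself, a plain PyTorch encoder-decoder can be sketched with `nn.Transformer` (this is not a fastai model, and the sizes below are illustrative); the encoder’s hidden states are available directly from its `encoder` submodule:

```python
# Sketch of an encoder-decoder Transformer in plain PyTorch (not fastai).
import torch
import torch.nn as nn

model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6)

src = torch.rand(10, 32, 512)   # (src_len, bs, d_model), sequence-first by default
memory = model.encoder(src)     # encoder hidden states: (src_len, bs, d_model)

tgt = torch.rand(9, 32, 512)    # (tgt_len, bs, d_model)
out = model.decoder(tgt, memory)  # decoder attends over the encoder states
```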

Thank you for the reply. One last question: is it even possible to train such a model with fastai? To train these models efficiently I would need to provide source masks and target masks with the batches, and I assume the fastai Learner only accepts two items (x, y) from a TensorDataset, not (x, y, src_mask, trg_mask).

x and y can be lists of tensors, so I don’t think there is a problem here.
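
For example (an illustrative sketch, not fastai’s own API; the dataset class and forward signature here are hypothetical), you can pack the masks into x so they ride along with the inputs and get unpacked into the model’s forward:

```python
# Sketch: if the Dataset returns x as a tuple, the batch arrives as a tuple of
# tensors and can be unpacked into model.forward. Names are illustrative.
from torch.utils.data import Dataset

class Seq2SeqDataset(Dataset):
    "Yields ((src, src_mask, trg_mask), trg) so the Learner sees x as a tuple."
    def __init__(self, src, trg, src_mask, trg_mask):
        self.src, self.trg = src, trg
        self.src_mask, self.trg_mask = src_mask, trg_mask
    def __len__(self): return len(self.src)
    def __getitem__(self, i):
        return (self.src[i], self.src_mask[i], self.trg_mask[i]), self.trg[i]

# The model's forward then takes the unpacked tuple:
# def forward(self, src, src_mask, trg_mask): ...
```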
