Language Model - getting access to encoder embeddings


I was wondering if there is a way to get access to the encoder embeddings after training the model. Currently I’m trying to use the model in an unsupervised setting, meaning that loading the encoder into a classifier model (as in the class notebook) won’t do. Is there a way to simply extract that information from the saved encoder?

Hi Theodore,

First load the encoder (by running learner.load(…)), and then extract the weights from the first layer like this:

m = learner.model
layers = list(m.children())
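
The snippet above stops after listing the child modules. A minimal sketch of the remaining step, assuming the first child keeps its embedding table in an attribute named encoder (suggested by the 0.encoder key mentioned later in this thread); ToyRNNEncoder and the sizes are hypothetical stand-ins, not the fastai classes:

```python
import torch.nn as nn

# Hypothetical stand-in for the first child of learner.model: a module
# that keeps its embedding table in an attribute called `encoder`.
class ToyRNNEncoder(nn.Module):
    def __init__(self, vocab_size, emb_size):
        super().__init__()
        self.encoder = nn.Embedding(vocab_size, emb_size)

# Toy model: the encoder first, then a decoder-like linear layer.
m = nn.Sequential(ToyRNNEncoder(vocab_size=10, emb_size=4), nn.Linear(4, 10))

layers = list(m.children())
# One row per vocabulary token; detach() drops the autograd graph.
emb = layers[0].encoder.weight.detach().cpu()
print(tuple(emb.shape))  # (10, 4)
```

Row i of emb would then be the embedding of the token with index i in the vocabulary.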




Thanks a lot for the code, will give it a try!

Say, what would be the most efficient way of doing it if I’ve already closed the session (but saved the encoder to disk)? Would I need to recreate the md object through the LanguageModelData command?

I seem to have managed it by using the torch.load function on the saved model and then accessing the 0.encoder.weight entry. The same command also worked for loading the encoder weights!
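
For anyone following along, a rough sketch of that approach, assuming the file on disk is a plain PyTorch state dict (a dictionary of tensors). The file name, the second key and the sizes are made up for illustration; only the 0.encoder.weight key comes from this thread:

```python
import os
import tempfile
from collections import OrderedDict

import torch

# Stand-in for the saved file: a state dict with made-up contents.
vocab_size, emb_size = 10, 4
saved = OrderedDict([
    ("0.encoder.weight", torch.randn(vocab_size, emb_size)),
    ("1.decoder.weight", torch.randn(vocab_size, emb_size)),  # hypothetical key
])
path = os.path.join(tempfile.mkdtemp(), "toy_encoder.h5")  # hypothetical name
torch.save(saved, path)

# map_location="cpu" lets the file load on a machine without a GPU.
state = torch.load(path, map_location="cpu")
print(list(state.keys()))

emb = state["0.encoder.weight"]
print(tuple(emb.shape))  # (10, 4)
```

Under that assumption there is no need to rebuild the data object first, because the weights are read straight from the dictionary rather than through the learner.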

I am sorry for the barrage of messages. I’m really new to PyTorch and fastai, and some things are not so clear. I will probably go read the documentation after this, but I wanted to add a question nonetheless.

Apart from the ‘encoder.weight’ matrix that I extracted, there is another matrix called ‘encoder_with_dropout.embed.weight’. I’m not really clear on what this is or whether I should be extracting this one instead. Does the model save two versions of this?

Again, apologies if these are obvious to most of you.
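
One possible explanation, sketched in plain PyTorch: if a dropout wrapper holds a reference to the same embedding module, the state dict lists that one weight matrix under two names. Whether the model in this thread is built this way is an assumption; ToyEmbeddingDropout and ToyEncoder below are hypothetical stand-ins:

```python
import torch
import torch.nn as nn

class ToyEmbeddingDropout(nn.Module):
    # Hypothetical wrapper: keeps a reference to an existing embedding.
    def __init__(self, embed):
        super().__init__()
        self.embed = embed

class ToyEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Embedding(10, 4)
        # The wrapper shares the module above instead of copying it.
        self.encoder_with_dropout = ToyEmbeddingDropout(self.encoder)

sd = ToyEncoder().state_dict()
print(list(sd.keys()))
# ['encoder.weight', 'encoder_with_dropout.embed.weight']

same = torch.equal(sd["encoder.weight"], sd["encoder_with_dropout.embed.weight"])
print(same)  # True
```

In a setup like this the two entries hold identical values, so extracting either one gives the same embeddings.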

Hi @MatthiasBachfischer, I am looking for the embeddings for a text description from the pretrained model. Can you help me with that? I am new to fastai and PyTorch…