How to only use encoder for predicting?

zax · November 9, 2018, 12:19am

I fine tuned my language model, but I want to only export that 400Dim vector for each query, I don’t want any classification or any other tasks. How should I use my LM to calculate this 400Dim each time?

sgugger · November 9, 2018, 1:16am

Just get your data through model[0] and save the results.

zax · November 9, 2018, 9:40am

tnx for the prompt answer.
I tried this:

enc = learn.model[0] (torch.tensor([[17]]).cuda())

It returned a tuple with a length of 2, which they are identical. I used enc[1][2] because that was the only output with a length of 400. I want to check is it the right way? which vector should I used as sentence encoder? the 3rd one? and why the model returned a tuple with the same variables?

sgugger · November 9, 2018, 1:56pm

You get a tuple of tensors that are the results of all three layers, without and with dropouts (cause both are needed for later regularization).
In your case you want the second tensor, and then the result of the last layer so enc[1][2] as you said (or enc[1][-1]).

They shouldn’t be exactly the same if you have dropout.