# How to get probability of a sentence

Using ULMFit, how do I calculate the probability of a sentence?

I know how to get the most likely next word, but where can I find the probability?

```python
nexts = torch.topk(res[-1], 3)[1]  # indices of the 3 most likely words, in order
w = itos[nexts[0].data[0]]         # look up the most likely word
```

This looks like some kind of word score, but it doesn’t look like a probability:

`res[-1][nexts[0].data[0]]`

For example:

`print(res[-1][nexts[0].data[0]], res[-1][nexts[1].data[0]], res[-1][nexts[2].data[0]])`

gives: 18.6216, 9.8292, 8.5659

That makes sense, but how do I convert these numbers to probabilities?

I would also like to know how to get the probability of each word, since I need it for beam search. If res[-1] really is some kind of word score, maybe we could use softmax to convert the values to probabilities, something like nn.Softmax()(res[-1]), but since I am a beginner in PyTorch, I am not sure whether that is correct.
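For illustration, here is a minimal sketch of what softmax does to scores like the three printed above (this assumes the scores are logits; note that in practice the softmax should be taken over the scores of the *entire* vocabulary, not just the top 3):

```python
import torch
import torch.nn.functional as F

# toy stand-in for the three example scores shown above
scores = torch.tensor([18.6216, 9.8292, 8.5659])

# softmax exponentiates and normalizes, so the outputs sum to 1
probs = F.softmax(scores, dim=0)
print(probs)
```

Because the top score is so much larger than the others, almost all of the probability mass ends up on the first word.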


I would also like to know how it works.

That makes sense to me, though I don’t know PyTorch well either. Shouldn’t the softmax be over all possible words somehow?

Yes, nn.Softmax()(res[-1]) (or equivalently F.softmax(res[-1])) will calculate the probability of every word in the vocabulary, since res[-1] still contains a score for each word. That is different from the result of torch.topk(res[-1], X), which contains only the X largest values and their indices.
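A quick sketch of that difference, using a made-up score vector in place of the real res[-1] (the vocabulary size of 10 is just for illustration):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
scores = torch.randn(10)  # stand-in for res[-1]: one score per vocabulary word

# softmax over the whole vocabulary: one probability per word, summing to 1
all_probs = F.softmax(scores, dim=0)

# topk merely selects the 3 largest entries and their indices
top_probs, top_idx = torch.topk(all_probs, 3)
```

Since softmax is monotonic, topk on the probabilities picks out the same word indices as topk on the raw scores; what changes is only whether the values are normalized.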

I also just read a similar question elsewhere: https://discuss.pytorch.org/t/how-to-extract-probabilities/2720/11

So then we can use a script something like the final one on https://nlpforhackers.io/language-models/ to generate a final probability, right?
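If I understand that script correctly, it is just applying the chain rule: P(w1…wn) = P(w1) · P(w2|w1) · … · P(wn|w1…wn-1). A toy sketch with made-up conditional probabilities (the numbers are hypothetical, not model outputs):

```python
# hypothetical conditionals: P(w1), P(w2 | w1), P(w3 | w1, w2)
conditional_probs = [0.2, 0.5, 0.1]

# chain rule: the sentence probability is the product of the conditionals
sentence_prob = 1.0
for p in conditional_probs:
    sentence_prob *= p
```

In the language-model setting, each conditional would come from the softmax over the vocabulary at that step.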

I’m going to try this anyway.

This seems to work. Thanks @cahya. @Boone you might be interested.

Any obvious mistakes?

```python
def calc_prob(model, text):
    texts = ['xbos xfld 1 ' + text]
    tokens = Tokenizer().proc_all_mp(partition_by_cores(texts))

    # initialize probability to 1
    prob = 1.0
    aggregated_token_indexes = []

    # we are only dealing with one sentence, so only look at tokens[0]
    for token in tokens[0]:

        # get the index of the token
        token_idx = stoi[token]

        # don't predict on a zero-length array
        if len(aggregated_token_indexes) > 0:

            # we want a [x,1] array where x is the number
            # of words inputted (including the prefix tokens)
            ary = np.reshape(np.array(aggregated_token_indexes), (-1, 1))

            # turn this array into a tensor
            tensor = torch.from_numpy(ary)

            # wrap it in a torch Variable
            variable = Variable(tensor)

            # batch size of 1
            model[0].bs = 1

            # make sure we are in evaluation mode
            model.eval()
            model.reset()

            # predict what word comes next, based on the text BEFORE this current word
            res, *_ = model(variable)

            # res[-1] contains the score of each possible token in the vocabulary;
            # use softmax to turn the scores into probabilities
            all_token_probs = F.softmax(res[-1]).data

            # find the probability of this token and multiply it by the
            # probability of all the preceding text
            prob *= all_token_probs[token_idx]

        # append this token's index for the next loop iteration
        aggregated_token_indexes.append(token_idx)

    return prob
```
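One practical note on the code above (my own suggestion, not part of the original function): multiplying many probabilities underflows toward 0 for long texts, so beam-search implementations usually sum log-probabilities instead. A minimal sketch:

```python
import math

# hypothetical per-token probabilities produced by the model
token_probs = [0.05, 0.01, 0.2]

# summing logs avoids underflow from multiplying many small numbers
log_prob = sum(math.log(p) for p in token_probs)

# convert back only if the raw probability is really needed
prob = math.exp(log_prob)
```

For beam search you would compare hypotheses by their summed log-probabilities directly and never exponentiate.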

Hi, thanks for the code, I will try it.