How to get the probability of a word given context in a language model?

I train a language model:

learn_generate = language_model_learner(wiki_data, AWD_LSTM)
learn_generate.unfreeze()
learn_generate.lr_find(1)
learn_generate.fit_one_cycle(1, 1e-2, moms=(0.8, 0.7))

Now, I want to get the probability of a word in a context. So it should be something like:
learn_generate.get_probability(context="The car is", word="red")

What is the best way to do so?

You need to use predict, which will return the most likely word, its index and (most importantly) the tensor of probabilities. Then pick the index of the word "red" (the correspondence from strings to ids is in wiki_data.vocab.stoi).

@sgugger So it would be :

TEXT = "The car is"
WORD = "red"
learn_generate.predict(TEXT, N_WORDS)
p = learn_generate.vocab.stoi[WORD]

?

Try it and see what you get.

@sgugger Please notice predict returns a string and not any tensor of probabilities:

https://docs.fast.ai/text.learner.html#LanguageLearner.predict


Forgot about that. You should copy and paste the code from Learner.predict then, to be able to get what you want.
That makes me think it’s not ideal that we replace predict in this case; it should maybe have another name.

@sgugger Sorry but I am not sure what you meant - copy the source code?
this part?

def predict(self, text:str, n_words:int=1, no_unk:bool=True, temperature:float=1., min_p:float=None, sep:str=' ',
            decoder=decode_spec_tokens):
    "Return the `n_words` that come after `text`."
    ds = self.data.single_dl.dataset
    self.model.reset()
    xb,yb = self.data.one_item(text)
    new_idx = []
    for _ in range(n_words): #progress_bar(range(n_words), leave=False):
        res = self.pred_batch(batch=(xb,yb))[0][-1]
        #if len(new_idx) == 0: self.model[0].select_hidden([0])
        if no_unk: res[self.data.vocab.stoi[UNK]] = 0.
        if min_p is not None: 
            if (res >= min_p).float().sum() == 0:
                warn(f"There is no item with probability >= {min_p}, try a lower value.")
            else: res[res < min_p] = 0.
        if temperature != 1.: res.pow_(1 / temperature)
        idx = torch.multinomial(res, 1).item()
        new_idx.append(idx)
        xb = xb.new_tensor([idx])[None]
    return text + sep + sep.join(decoder(self.data.vocab.textify(new_idx, sep=None)))

(from here: https://github.com/fastai/fastai/blob/master/fastai/text/learner.py#L116)
?
How would that help?

@sgugger Hi, can you please help me with how to get a prediction for a specific word?

Like I said, use the function predict from the basic Learner (not the one you showed, which replaces it).

@sgugger Oh sorry, I didn’t notice it should be from the basic Learner.
But I still get an error:

batch = wiki_toy_data.one_item("red")
res = learn_generate.pred_batch(batch=batch)
raw_pred, x = grab_idx(res, 0, batch_first=True), batch[0]
norm = getattr(learn_generate.data, 'norm', False)
if norm:
    x = learn_generate.data.denorm(x)
    if norm.keywords.get('do_y', False): raw_pred = learn_generate.data.denorm(raw_pred)
ds = learn_generate.data.single_ds
pred = ds.y.analyze_pred(raw_pred)
x = ds.x.reconstruct(grab_idx(x, 0))
y = ds.y.reconstruct(pred, x) if has_arg(ds.y.reconstruct, 'x') else ds.y.reconstruct(pred)

And get:


TypeError                                 Traceback (most recent call last)

<ipython-input-23-0587895e696a> in <module>()
      8 pred = ds.y.analyze_pred(raw_pred)
      9 x = ds.x.reconstruct(grab_idx(x, 0))
---> 10 y = ds.y.reconstruct(pred, x) if has_arg(ds.y.reconstruct, 'x') else ds.y.reconstruct(pred)

3 frames

/usr/local/lib/python3.6/dist-packages/fastai/text/transform.py in <listcomp>(.0)
    132     def textify(self, nums:Collection[int], sep=' ') -> List[str]:
    133         "Convert a list of nums to their tokens."
--> 134         return sep.join([self.itos[i] for i in nums]) if sep is not None else [self.itos[i] for i in nums]
    135
    136     def __getstate__(self):

TypeError: only integer tensors of a single element can be converted to an index

You don’t need the last two lines; your predictions are in pred.

@sgugger But then how can I get the probability of the word "red" in a specific context?
Meaning I want to get P("red" | "The car is"), but pred[0][data.vocab.stoi["red"]] will just return P("red")?

You will need to feed "The car is" to your model and then ask for the probability of "red".

That’s exactly what I need.

Did you solve it? Any chance to get your solution? Thanks

@johnsnowthedeveloper Actually I didn’t, please LMK if you have a solution.

I think I solved it. I’m pasting my function below predict (which I used as a reference):

    def predict(self, text:str, n_words:int=1, no_unk:bool=True, temperature:float=1., min_p:float=None, sep:str=' ',
                decoder=decode_spec_tokens):
        "Return `text` and the `n_words` that come after"
        self.model.reset()
        xb,yb = self.data.one_item(text)
        new_idx = []
        for _ in range(n_words): #progress_bar(range(n_words), leave=False):
            res = self.pred_batch(batch=(xb,yb))[0][-1]
            #if len(new_idx) == 0: self.model[0].select_hidden([0])
            if no_unk: res[self.data.vocab.stoi[UNK]] = 0.
            if min_p is not None:
                if (res >= min_p).float().sum() == 0:
                    warn(f"There is no item with probability >= {min_p}, try a lower value.")
                else: res[res < min_p] = 0.
            if temperature != 1.: res.pow_(1 / temperature)
            idx = torch.multinomial(res, 1).item()
            new_idx.append(idx)
            xb = xb.new_tensor([idx])[None]
        return text + sep + sep.join(decoder(self.data.vocab.textify(new_idx, sep=None)))
    
    def get_prob_of_word_in_context(self, context: str, word: str):
        "Return P(`word` | `context`) under the language model."
        self.model.reset()                            # reset the RNN hidden state
        xb,yb = self.data.one_item(context)           # numericalize the context into a batch of one
        res = self.pred_batch(batch=(xb, yb))[0][-1]  # scores over the vocab for the next token
        normalized_scores = F.softmax(res, dim=-1)    # normalize into a probability distribution
        index_of_word = self.data.vocab.stoi[word]    # vocab index of the target word
        prob_of_word_given_context = normalized_scores[index_of_word]
        return prob_of_word_given_context

First we reset the model, feed it the context (just like the predict code does), and call pred_batch the same way.
res is a vector whose length is the vocabulary size (len(vocab.stoi)), so it contains a score for every possible next word. Normalizing it with softmax gives a probability distribution. Finally, we look up the index of the word we want with stoi and read off its probability.
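
For completeness, here is a minimal usage sketch. It assumes fastai v1, the learn_generate trained at the top of the thread, and that the helper above is defined at module level; attaching it to LanguageLearner is just one convenient (hypothetical) way of calling it:

import torch.nn.functional as F          # used inside get_prob_of_word_in_context
from fastai.text import LanguageLearner

# Attach the helper defined above to LanguageLearner so it can be called as a method.
# (You could equally call it as a plain function: get_prob_of_word_in_context(learn_generate, ...).)
LanguageLearner.get_prob_of_word_in_context = get_prob_of_word_in_context

p = learn_generate.get_prob_of_word_in_context(context="The car is", word="red")
print(f'P("red" | "The car is") = {p.item():.4f}')   # p is a 0-dim tensor

One thing worth double-checking: in fastai v1, pred_batch appears to already apply the activation matching the loss function (softmax for cross-entropy), which is why LanguageLearner.predict samples from res with torch.multinomial directly. If that is the case in your version, the extra F.softmax will flatten the distribution, and you can read the probability straight out of res[index_of_word] instead.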