Beam Search / Prediction Bug in Text Learner

I believe there are two issues with the beam search implementation.

  1. The first is that the model is not put into eval mode. This is also true for learner.predict. Is this intentional?

  2. The more important issue is in the beam search algorithm itself. As currently implemented, when top_k is equal to beam_sz, the algorithm returns top_k copies of the same string. (Note that you will only see this behavior if you first put the model in eval mode; otherwise dropout causes random fluctuations in the probabilities.) As I understand the implementation, top_k controls how many child tokens of any one path down the search tree are considered when selecting the top beam_sz paths. If top_k were set to the size of the vocabulary, it should therefore have no influence on the result, and you would expect the algorithm to have found beam_sz different paths by the time the inner loop completes. As currently implemented, however, it returns beam_sz copies of a single path.
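The collapse is easy to reproduce outside fastai. Below is a minimal pure-Python sketch of the same selection logic; `toy_beam_search` and the stand-in log-prob table are made up for illustration and replace the real model. Starting from a single copy of the input yields beam_sz distinct paths, while starting from top_k identical copies makes every kept candidate a duplicate of the single best child, so all beams collapse onto one path.

```python
import math
import random

# Toy sketch, not the fastai code: a fixed random next-token log-prob table
# stands in for the language model, so the effect of starting from top_k
# identical copies of the input is easy to isolate.
random.seed(0)
vocab_sz, top_k, beam_sz, n_steps = 5, 3, 3, 4
rows = [[random.random() for _ in range(vocab_sz)] for _ in range(vocab_sz)]
table = [[math.log(w / sum(row)) for w in row] for row in rows]  # table[prev][next]

def toy_beam_search(start_paths):
    nodes = [list(p) for p in start_paths]   # one token path per beam
    scores = [0.0] * len(nodes)              # accumulated negative log-likelihood
    for _ in range(n_steps):
        cands = []
        for path, score in zip(nodes, scores):
            out = table[path[-1]]
            # consider only the top_k children of each path, as the learner does
            for tok in sorted(range(vocab_sz), key=out.__getitem__)[-top_k:]:
                cands.append((score - out[tok], path + [tok]))
        cands.sort(key=lambda c: c[0])       # keep the best beam_sz candidates
        nodes = [p for _, p in cands[:beam_sz]]
        scores = [s for s, _ in cands[:beam_sz]]
    return nodes

print(len({tuple(p) for p in toy_beam_search([[0]])}))          # 3: distinct beams
print(len({tuple(p) for p in toy_beam_search([[0]] * top_k)}))  # 1: every beam collapses
```

This mirrors the learner's accumulation of negative log-likelihoods (scores = -values + scores) and its beam_sz cutoff: with duplicated roots, every copy proposes the same candidates with identical scores, so the cutoff keeps copies of one path rather than distinct paths.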

The fix is as simple as commenting out the line that expands xb into top_k copies:

    def beam_search(self, text:str, n_words:int, no_unk:bool=True, top_k:int=10, beam_sz:int=1000, temperature:float=1.,
                    sep:str=' ', decoder=decode_spec_tokens):
        "Return the `n_words` that come after `text` using beam search."
        ds = self.data.single_dl.dataset
        self.model.reset()
        self.model.eval()
        xb, yb = self.data.one_item(text)
        nodes = None
        # xb = xb.repeat(top_k, 1)
        nodes = xb.clone()
        scores = xb.new_zeros(1).float()
        ...

To test this out, I started with the tutorial.inference.ipynb notebook. I added a method called beam_search_v2 in which I put the model in eval mode, commented out the line above, and returned the full list of beam_sz paths found at the end of the algorithm rather than a random sample from them. The code is:

    def beam_search_v2(self, text:str, n_words:int, no_unk:bool=True, top_k:int=10, beam_sz:int=1000, temperature:float=1.,
                    sep:str=' ', decoder=decode_spec_tokens):
        "Return the `n_words` that come after `text` using beam search."
        #import pdb; pdb.set_trace()
        ds = self.data.single_dl.dataset
        self.model.reset()
        self.model.eval()
        xb, yb = self.data.one_item(text)
        nodes = None
        #xb = xb.repeat(top_k, 1)
        nodes = xb.clone()
        scores = xb.new_zeros(1).float()
        with torch.no_grad():
            for k in progress_bar(range(n_words), leave=False):
                out = F.log_softmax(self.model(xb)[0][:,-1], dim=-1)
                if no_unk: out[:,self.data.vocab.stoi[UNK]] = -float('Inf')
                values, indices = out.topk(top_k, dim=-1)
                scores = (-values + scores[:,None]).view(-1)
                indices_idx = torch.arange(0,nodes.size(0))[:,None].expand(nodes.size(0), top_k).contiguous().view(-1)
                sort_idx = scores.argsort()[:beam_sz]
                scores = scores[sort_idx]
                nodes = torch.cat([nodes[:,None].expand(nodes.size(0),top_k,nodes.size(1)),
                                indices[:,:,None].expand(nodes.size(0),top_k,1),], dim=2)
                nodes = nodes.view(-1, nodes.size(2))[sort_idx]
                self.model[0].select_hidden(indices_idx[sort_idx])
                xb = nodes[:,-1][:,None]
        return nodes

You can then count the number of unique paths returned by the beam search algorithm with the following:

    nodes = learn.beam_search_v2('This is a simple test of', n_words=40, beam_sz=10, top_k=10)
    num_unique_paths = len(set([tuple([x.item() for x in n]) for n in nodes]))
    print(f"There are {num_unique_paths} unique paths")
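As an aside, the same count can be computed without the Python set, using torch.unique over rows. A sketch on a stand-in tensor shaped like the [beam_sz, seq_len] result:

```python
import torch

# Stand-in for the [beam_sz, seq_len] tensor of token ids returned by
# beam_search_v2; rows 0 and 1 are duplicate paths.
nodes = torch.tensor([[1, 2, 3],
                      [1, 2, 3],
                      [1, 4, 5]])
num_unique_paths = torch.unique(nodes, dim=0).size(0)
print(num_unique_paths)  # 2
```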

With the line commented out we get the expected 10 unique paths; with the line left in, we get 1 unique path.
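On the eval-mode point from issue 1: the dropout fluctuations are easy to see in isolation. A minimal sketch with a bare Dropout layer (nothing fastai-specific):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(0.5)
x = torch.ones(100)

drop.train()                 # training mode: dropout is active
a, b = drop(x), drop(x)
print(torch.equal(a, b))     # almost surely False: two independent masks

drop.eval()                  # eval mode: dropout is the identity
a, b = drop(x), drop(x)
print(torch.equal(a, b))     # True: repeated passes agree
```

Without model.eval(), two identical beams get different scores on each forward pass, which masks the deterministic duplicate-path behavior described above.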

I am happy to make a pull request if others agree that there is a bug here. Thanks!


@ashaw I believe you made the pull request that added this change. Do you agree that there is a bug here?

I agree that there is a bug for the model in eval mode, and I'd merge any PR that fixes that. For the rest of the algorithm I'd need to sit down and think it through carefully for an afternoon, but I'm busy with v1.1 development right now, so I'll take any fix people think is good 🙂
