When I was trying to re-implement lesson 6 using a `Learner` from a `LanguageModelData` object, my predictions kept repeating the same output in an infinite loop. I've seen some forum posts about the same issue as well.
The short answer is to replace

```python
n = res[-1].topk(2)[1]
n = n[1] if n.data[0] == 0 else n[0]
res, *_ = m(n.unsqueeze(0))
```

with

```python
r = torch.multinomial(res[-1].exp(), 2)
r = r[1] if r.data[0] == 0 else r[0]
res, *_ = m(r.unsqueeze(0))
```
It looks like models can often converge to putting the same prediction on top over and over again, and the added variability of sampling with `torch.multinomial` rather than always taking the top result from `torch.topk` avoids this. I had the same issue with word-level models like in lesson 4, and the same fix works there.
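To see why this helps, here's a minimal, self-contained sketch of the difference (the toy `log_probs` vector stands in for the model's final output `res[-1]`, which holds log-probabilities, hence the `.exp()` before sampling):

```python
import torch

torch.manual_seed(0)

# Fake "model output": log-probabilities over a tiny 4-token vocabulary.
# (Stands in for res[-1] in the lesson's generation loop.)
log_probs = torch.log(torch.tensor([0.05, 0.60, 0.25, 0.10]))

# Greedy: topk always returns the same argmax, so feeding it back into
# the model can make generation cycle through the same tokens forever.
greedy = log_probs.topk(1)[1].item()

# Stochastic: multinomial samples in proportion to exp(log_probs), so
# lower-probability tokens still get picked sometimes, breaking the loop.
samples = [torch.multinomial(log_probs.exp(), 1).item() for _ in range(20)]

print(greedy)        # always index 1, the argmax
print(set(samples))  # typically several distinct indices
```

Run repeatedly, the greedy line never changes, while the sampled indices vary from call to call, which is exactly the variability that stops the generated text from getting stuck.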
For more detail, I wrote up a quick notebook on how to implement this for both word-level and character-level models.