When I was trying to re-implement lesson 6 using a Learner from a LanguageModelData object, my model's predictions kept getting stuck in an infinite loop, emitting the same characters over and over. I've seen several forum posts describing the same issue.
The short answer is to replace
```python
# greedy decoding: take the highest-scoring index, falling back to the
# second-highest when the top one is index 0
n = res[-1].topk(2)[1]
n = n[1] if n.data[0] == 0 else n[0]
res, *_ = m(n[0].unsqueeze(0))
```
with
```python
# sampled decoding: draw two candidates from the predicted distribution
# (res[-1] holds log-probabilities, so .exp() recovers probabilities)
r = torch.multinomial(res[-1].exp(), 2)
r = r[1] if r.data[0] == 0 else r[0]
res, *_ = m(r[0].unsqueeze(0))
```
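In both snippets, res[-1] is the model's output for the most recent input token, and two candidates are kept so that index 0, which I believe is the vocab's unknown/padding token, can be skipped when it comes out first. The .exp() call suggests the model's forward pass ends in log_softmax, so exponentiating recovers the actual probabilities torch.multinomial expects to sample from.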
It looks like models often converge to a state where the same prediction comes out on top at every step, so greedy decoding with torch.topk gets stuck repeating itself; sampling from the predicted distribution with torch.multinomial adds enough variability to break the cycle. I had the same issue with word-level models like the one in Lesson 4, and the same fix works there.
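To make the failure mode concrete, here's a minimal, self-contained sketch. Everything in it is made up for illustration: fake_log_probs stands in for res[-1] from a trained model, and it's rigged to strongly prefer bouncing between the first two characters, the kind of degenerate state a real model can converge to.

```python
import torch

torch.manual_seed(7)
vocab = list("abcdefgh")

def fake_log_probs(prev_idx):
    # Toy stand-in for a trained model's output: it almost always wants to
    # bounce between indices 0 and 1, the degenerate state described above.
    logits = torch.zeros(len(vocab))
    logits[1 - prev_idx % 2] = 5.0
    return torch.log_softmax(logits, dim=0)

def decode(steps=30, sample=False):
    idx, out = 0, [vocab[0]]
    for _ in range(steps):
        log_probs = fake_log_probs(idx)
        if sample:
            # draw the next index from the predicted distribution
            idx = torch.multinomial(log_probs.exp(), 1).item()
        else:
            # always take the single highest-scoring index
            idx = log_probs.topk(1)[1].item()
        out.append(vocab[idx])
    return "".join(out)

print("greedy: ", decode())             # abababab... forever
print("sampled:", decode(sample=True))  # mostly a/b, but can break out
```

The greedy loop is fully deterministic, so once the model favours a two-character cycle there is no way out; the sampled version still prefers the cycle but has a small probability of escaping it at every step.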
For more detail, I wrote up a quick notebook showing how to implement this for both word-level and character-level models.