Wiki: Lesson 4

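I think I figured this out. The model isn't the issue; it's how the next prediction is chosen. torch.topk always returns the highest-scoring predictions, so you get the same choice every time, while torch.multinomial samples from the predicted probabilities. You want to use torch.multinomial instead of torch.topk.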
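Here's a rough sketch of the difference, not the actual lesson code. The logits and the helper names (`pick_greedy`, `pick_sampled`) are made up for illustration; only `torch.softmax`, `torch.topk`, and `torch.multinomial` are real PyTorch calls.

```python
import torch

torch.manual_seed(0)  # only so repeated runs of this sketch match

# Hypothetical scores over a 5-item vocabulary (not from a real model).
logits = torch.tensor([2.0, 1.5, 0.5, 0.1, -1.0])
probs = torch.softmax(logits, dim=-1)

def pick_greedy(probs):
    # topk(1) returns the single largest probability and its index,
    # so the result is the same on every call.
    return probs.topk(1).indices.item()

def pick_sampled(probs):
    # multinomial draws an index at random, weighted by probs,
    # so less likely items still get picked some of the time.
    return torch.multinomial(probs, num_samples=1).item()

greedy = [pick_greedy(probs) for _ in range(20)]
sampled = [pick_sampled(probs) for _ in range(20)]

print("topk:       ", greedy)
print("multinomial:", sampled)
```

In a generation loop, the index picked this way would be fed back in as the next input, so the greedy version tends to lock into one path while the sampled version varies.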

I just made a post about it here