Just wanted to share an interesting paper I was reading about neural text generation. Having trained a bunch of models for a Bengali language generation / summarization / translation project, I ran into this problem a lot: getting the model to produce coherent sentences was not easy, and I was at a loss on how to improve it.
The paper goes through likelihood-maximization decoding as well as more stochastic generation methods, and also details the authors' own contribution, called Nucleus Sampling. It seems to be a sort of dynamic top-k method, where the set of candidate tokens adapts to the shape of the next-token distribution at each sampling step rather than being fixed in advance.
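For anyone curious about the mechanics, here's a minimal toy sketch of top-p / nucleus sampling in NumPy (my own illustration of the idea as I understand it, not the paper's code): keep the smallest set of tokens whose cumulative probability reaches p, renormalize, and sample from that set.

```python
import numpy as np

def nucleus_sample(probs, p=0.9, rng=None):
    """Sample a token id from `probs` using nucleus (top-p) sampling."""
    rng = rng or np.random.default_rng()
    # Sort token probabilities from highest to lowest.
    order = np.argsort(probs)[::-1]
    sorted_probs = probs[order]
    # Cutoff: smallest prefix whose cumulative mass reaches p.
    cutoff = np.searchsorted(np.cumsum(sorted_probs), p) + 1
    nucleus = sorted_probs[:cutoff]
    # Renormalize the truncated distribution and sample from it.
    nucleus = nucleus / nucleus.sum()
    return order[rng.choice(cutoff, p=nucleus)]

# Toy next-token distribution over a 5-token vocabulary:
# with p=0.9, only the first three tokens fall inside the nucleus.
probs = np.array([0.5, 0.25, 0.15, 0.07, 0.03])
print(nucleus_sample(probs, p=0.9))
```

The nice property (as I read it) is that when the model is confident, the nucleus shrinks to a handful of tokens, and when the distribution is flat, it widens, which is exactly the "dynamic top-k" behavior described above.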
Thought this might be interesting to some of you, and maybe to fast.ai itself, given that NLP is a big reason a lot of us started with it.