I am wondering if anyone has had any success implementing teacher forcing with an AWD-LSTM language model from fastai? Rachel gives a great explanation of teacher forcing in general in the seq2seq lecture:
but she had to modify the forward function to optionally switch dec_inp depending on a random number. I do not see anything like this in the forward definition of the AWD_LSTM class, so I am wondering whether this is possible.
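For anyone unfamiliar with the idea, here is a minimal sketch of that kind of modified forward in plain PyTorch. The `TinyDecoder` class, its parameter names, and `teacher_force_p` are all hypothetical illustrations, not part of fastai's AWD_LSTM; it just shows the "switch dec_inp on a random number" pattern from the lecture:

```python
import random
import torch
import torch.nn as nn

class TinyDecoder(nn.Module):
    """Hypothetical decoder illustrating teacher forcing (not AWD_LSTM)."""
    def __init__(self, vocab_sz, emb_sz, hid_sz, teacher_force_p=0.5):
        super().__init__()
        self.emb = nn.Embedding(vocab_sz, emb_sz)
        self.rnn = nn.GRU(emb_sz, hid_sz, batch_first=True)
        self.out = nn.Linear(hid_sz, vocab_sz)
        self.teacher_force_p = teacher_force_p  # prob. of feeding the gold token

    def forward(self, targets, h=None):
        # targets: (batch, seq_len) tensor of gold token ids
        bs, seq_len = targets.shape
        dec_inp = targets[:, :1]  # start from the first gold token
        logits = []
        for t in range(1, seq_len):
            o, h = self.rnn(self.emb(dec_inp), h)
            step_logits = self.out(o[:, -1])     # (batch, vocab_sz)
            logits.append(step_logits)
            # teacher forcing: with probability p feed the gold token,
            # otherwise feed the model's own prediction back in
            if self.training and random.random() < self.teacher_force_p:
                dec_inp = targets[:, t:t + 1]
            else:
                dec_inp = step_logits.argmax(dim=-1, keepdim=True)
        return torch.stack(logits, dim=1)  # (batch, seq_len - 1, vocab_sz)
```

Setting `teacher_force_p=1.0` gives always-on teacher forcing, and annealing it toward 0 over training is the scheduled-sampling variant. As far as I can tell, fastai's stock AWD_LSTM forward has no such branch, so you would need to subclass or patch it yourself.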
I am surprised Jeremy hasn't mentioned teacher forcing in the part 1 or part 2 lectures during the WikiText-103 training portions. I assumed it was best practice to train with teacher forcing when creating generative models. Is this not the case?