QRNN hyperparameters for language modeling on PTB

march1905 · August 6, 2019, 10:14am

I tried to run the fastai QRNN (quasi-recurrent neural network) implementation with the same hyperparameters as the original repository (GitHub - salesforce/awd-lstm-lm: LSTM and QRNN Language Model Toolkit for PyTorch), but the perplexity I get from the fastai implementation is not close to that of the original implementation. For example, in the original QRNN implementation the learning rate for PTB is 30, while the same learning rate in fastai results in infinite perplexity. Am I missing something? Could you please help me reproduce the same result on the Penn Treebank dataset?
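
For context on what "infinite perplexity" means here: perplexity is usually reported as the exponential of the mean per-token cross-entropy loss (in nats), so a training loss that blows up is reported as an infinite perplexity. Below is a minimal sketch of that relationship in plain Python. The helper name `perplexity` is hypothetical and is not taken from fastai or awd-lstm-lm.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Return exp(loss), where loss is the mean per-token cross-entropy in nats."""
    try:
        return math.exp(mean_cross_entropy)
    except OverflowError:
        # math.exp overflows a 64-bit float for very large finite losses,
        # so a diverged loss is reported as infinite perplexity.
        return float("inf")


if __name__ == "__main__":
    # A model that spreads probability uniformly over 60 choices per token
    # has loss log(60) and therefore a perplexity of about 60.
    print(perplexity(math.log(60)))
    # A loss that has diverged during training gives infinite perplexity.
    print(perplexity(1000.0))
```

So the symptom with a learning rate of 30 would be consistent with the loss itself diverging, rather than with a problem in how perplexity is computed.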