Hello @jolackner. I’m impressed by your “2 hours by epoch on a P4”!
Could you tell us more about your data and parameters (size of vocab.itos and vocab.stoi, size of the dataset used to create your databunch, batch size, drop_mult in your learner…)? Thank you.
Note: yesterday I also asked a question in the Language Model Zoo thread about the dataset size to use for training.