ULMFiT Training Time on 250M token corpus - 8.5hr/epoch

Hi! I just discovered this library and jumped right in (I intend to take the class later when I have some time). Because of that, and because I used the AWS DeepLearning AMI rather than the fastai AMI, I may have missed some system setup that would improve the performance I’m seeing. Right now, training on a corpus of around 250 million tokens takes 8.5 hours per epoch.

I would love to know if this sounds reasonable, and if not, what I could do to improve it.

Below are some relevant details.

I’m using the ULMFiT model that was presented in the IMDB notebook (an AWD-LSTM model pretrained on WikiText-103, bs=52, bptt=70, embedding dim=400, hidden size=1150, 3 layers).

I’m training in a Jupyter notebook on a single AWS EC2 p2.xlarge instance running the DeepLearning Ubuntu AMI (the instance has an NVIDIA K80 GPU, 4 vCPUs, and 61 GiB RAM).

My vocab size is 50,000.
My pre-tokenized corpus consists of 247,289,534 tokens. No cleaning/tokenization is done during training.
There are 67,935 iterations/batches in an epoch.
Each epoch = 8.5105 hours.
Each iteration takes 0.45 sec.
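
As a sanity check on the numbers above, here is a rough sketch of the arithmetic in Python. It assumes every batch holds exactly bs × bptt tokens (the real sequence length may vary a little from batch to batch), and the variable names are just mine for illustration, not anything from the library.

```python
# Figures copied from the post above.
tokens = 247_289_534        # pre-tokenized corpus size
bs = 52                     # batch size
bptt = 70                   # sequence length per batch (assumed fixed here)
reported_iters = 67_935     # iterations per epoch
sec_per_iter = 0.45         # measured time per iteration
reported_hours = 8.5105     # measured time per epoch

# Tokens consumed by one batch under the fixed-length assumption.
tokens_per_batch = bs * bptt

# Batches per epoch implied by the corpus size.
expected_iters = tokens / tokens_per_batch

# Epoch time implied by the per-iteration timing.
epoch_hours = reported_iters * sec_per_iter / 3600

# Overall throughput implied by the measured epoch time.
tokens_per_sec = tokens / (reported_hours * 3600)

print(f"tokens per batch:  {tokens_per_batch}")
print(f"expected batches:  {expected_iters:.1f}")
print(f"epoch time (hr):   {epoch_hours:.2f}")
print(f"throughput (tok/s): {tokens_per_sec:.0f}")
```

So the iteration count and epoch time are consistent with each other, and the whole run works out to roughly 8,000 tokens per second on the K80.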


  • Bill