Has anyone run the ULMFit training scripts in courses/dl2/imdb_scripts or courses/dl2/imdb.py? If so, roughly how long did
finetuning the language model
training the classifier
take? I’m running on a Pascal TITAN X, and the finetuning step is taking a long time (~5 hours or so) – so just wanted to see if there’s anything obviously misconfigured on my machine.
I’ve got 95.1% acc using single 1080ti, I didn’t time it but I guess it was inbetween 2-3 hours. I skipped learning rate finders though.
I made some modifications right away to speed up calculations:
max_vocab = 30000 #60000
language model: bs=128 #52
classifier: bs = 64 #48
Do you have a run script or a fork I could look at to reproduce those results? I get a number of errors when I run the imdb_scripts code – bad paths, missing arguments, etc.
Hey! I’m trying to work through the imdb.ipyn right now and would love any feedback or pointers.
I have set up an account at Google Cloud, and am running an NVIDIA k80 GPU with them.
Right now I’m running the first cell where we are fitting the model:
learner.fit(lrs/2, 1, wds=wd, use_clr=(32,2), cycle_len=1)
and it’s only going at about 1 iter/sec, and my volatile-gpu-util is almost at 100%. Would anyone know if this is literally how much the GPU can handle, or if there’s anything I can do to hurry up things along?
Hi! I just discovered this library and jumped right in (I intend to take the class later when I have some time). Given this fact, and the fact that I used the AWS DeepLearning AMI rather than the fastai AMI may mean that I have missed some system setup which would improve the performance I’m seeing. At this time, training on a corpus of around 250 million tokens, it takes 8.5 hours per epoch.
I would love to know if this is sounds reasonable, and if not, what I could do to improve it.
Below are some relevant details.
I’m using the ULMFiT model that was presented in the iMDB notebook (an AWD LSTM model pretrained on wiki103, bs=52, bptt=70, embedding dim=400, hid size=1150, 3 layers).
I’m training in a Jupyter notebook on AWS EC2 instance using the DeepLearning Ubuntu AMI on 1 p2.xlarge instance (which has NVIDIA K80 GPU, 4 vCPU, 61 GiB RAM).
My vocab size is 50,000.
My pre-tokenized corpus consists of 247,289,534 tokens. No cleaning/tokenization is done during training.
There are 67,935 iterations/batches in an epoch. Each epoch = 8.5105 hours.
Each iteration takes 0.45 sec.