30 GB Too Slow! Fine-Tuning the IMDB Language Model (Chapter 10, NLP)

In Chapter 10 (NLP), under “Fine-Tuning the Language Model”, Jeremy’s learn.fit_one_cycle is shown taking 11 minutes.

Mine takes 1 hour 45 minutes, and I had to reduce the batch size to 64 and the sequence length to 40 just to avoid a CUDA OOM error.
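For scale, activation memory for LM training grows roughly with tokens per batch, i.e. batch size × sequence length. A quick back-of-the-envelope comparison (assuming the lesson used fastai’s common defaults of bs=128 and seq_len=72; those exact values are an assumption, not something stated in this thread):

```python
# Rough comparison of tokens processed per batch.
# bs=128, seq_len=72 are ASSUMED lesson defaults; bs=64, seq_len=40
# are the reduced settings from this post.
def tokens_per_batch(bs: int, seq_len: int) -> int:
    return bs * seq_len

lesson = tokens_per_batch(128, 72)  # assumed lesson settings
mine = tokens_per_batch(64, 40)     # settings from this post
print(f"{lesson} vs {mine} tokens/batch -> {lesson / mine:.1f}x difference")
# → 9216 vs 2560 tokens/batch -> 3.6x difference
```

So even before GPU speed differences, each of my batches is doing roughly a third of the work, which compounds with a slower card.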

I’m using a Paperspace 30 GB GPU instance.

I sense that something is wrong. Does anyone know why it’s taking so long, or where I could look for tips on speeding it up?

Any help at all would be much appreciated.

[screenshot attached]


I am not sure the 30G refers to GPU memory; it is probably CPU memory. I tried running the same code on an RTX 5000 on jarvislabs.ai, and it took approximately 13.46 seconds.


Since you are getting a CUDA OOM error, it sounds like the model is being trained on the GPU rather than the CPU.
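If you want to double-check which GPU you actually have and how much memory it exposes, here’s a minimal sketch assuming PyTorch is installed (which it will be if fastai is):

```python
import torch

def gpu_summary() -> str:
    """Report the visible CUDA device, or note that training would hit the CPU."""
    if not torch.cuda.is_available():
        return "no CUDA device visible - training would fall back to CPU"
    props = torch.cuda.get_device_properties(0)
    gib = props.total_memory / 2**30
    return f"{props.name}: {gib:.1f} GiB of GPU memory"

print(gpu_summary())
```

If the reported GPU memory is much less than 30 GB, the “30G” in the instance name is system RAM, not GPU RAM.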

I’m not seeing a Paperspace 30G GPU on their instance list, but if it’s one of the cheaper GPUs, perhaps an M4000 or P4000, both have significantly less RAM and compute power than the Titan RTX the lesson was run on.

If that’s the case, you’d have better luck using Colab or Kaggle. Kaggle’s P100s are still a lot slower than the Titan RTX, but should be leaps and bounds better than an M4000 or P4000. Likewise, a T4 on Colab’s free tier will be slower but should also be an improvement, especially if you use mixed precision.
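In fastai, mixed precision is one call: learn.to_fp16(). Under the hood it uses PyTorch automatic mixed precision. A minimal plain-PyTorch sketch of one AMP training step, with a toy linear model standing in for the lesson’s AWD-LSTM (on a CPU-only machine it simply runs with AMP disabled):

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"  # AMP only kicks in on the GPU

model = nn.Linear(10, 2).to(device)          # toy model, not AWD-LSTM
opt = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

x = torch.randn(8, 10, device=device)
y = torch.randint(0, 2, (8,), device=device)

# Forward pass runs in float16 where safe; loss scaling guards gradients.
with torch.autocast(device_type=device, enabled=use_amp):
    loss = nn.functional.cross_entropy(model(x), y)

scaler.scale(loss).backward()
scaler.step(opt)
scaler.update()
print(f"loss: {loss.item():.3f}")
```

On a T4 this can roughly halve step time and memory for transformer/RNN workloads, which is what makes the free Colab tier viable for this chapter.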


Thanks so much for replying. I might try Kaggle or Colab next.

Currently, it shows that I’m using the Free P5000 (30 GB RAM, 8 CPUs).
[screenshot attached]

Before that, I was using the M4000.

Update!

About three weeks ago I splurged and started paying $8 per month to get access to more GPUs.

Currently I find the Free RTX4000 (30 GB RAM, 8 CPUs) to be very fast and almost always available.

In case anyone is looking to upgrade, I found this simple and well worth it.