Long training time lesson 1

Hi

My training time in lesson 1 is much longer than expected. I’m at the point where we train a resnet34 model for 4 epochs.

learn.fit_one_cycle(4)

It takes me over 5 minutes to train with an RTX 2070, whereas it takes Jeremy around 1.5 minutes. Should this be expected given my GPU?

I kept the batch size at the default of 64.
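
For context, the batch size is set when the DataBunch is created earlier in the notebook. This is roughly the fastai v1 lesson 1 setup, reconstructed from memory (path_img, fnames, and pat stand for whatever the notebook defines them as), shown only to make clear where bs would be changed:

from fastai.vision import *

bs = 64  # the lesson default; try increasing this if the GPU is underutilized
data = ImageDataBunch.from_name_re(path_img, fnames, pat,
                                   ds_tfms=get_transforms(), size=224, bs=bs
                                   ).normalize(imagenet_stats)
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
learn.fit_one_cycle(4)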

Something I noticed while training is that my GPU utilization was usually below 50% (with a lot of variation) and my CPU utilization was in the mid 70s.

If that’s happening, try a larger batch size to utilize the GPU more. But yes, it’s very rare that I match Jeremy’s times with the T4s etc. I can get on Colab.

Is that GPU usage coming from the Jupyter notebook? If not, maybe try explicitly telling Python to use your GPU?
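
For example, in plain PyTorch you can pin the model and the data to the GPU explicitly. A minimal sketch (the resnet34 and random batch here are just placeholders, not anything from the notebook):

import torch
import torchvision

# Use the GPU if PyTorch can see one, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = torchvision.models.resnet34().to(device)  # move the model's weights onto the device
batch = torch.randn(64, 3, 224, 224).to(device)   # inputs must live on the same device
output = model(batch)
print(output.device)                              # prints cuda:0 when the GPU is actually used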

I tried doubling the batch size; it didn’t increase GPU utilization, but it did increase CPU utilization.

I’m not sure fastai is using my GPU. However, it is available to PyTorch, because when I type torch.cuda.is_available() it returns True.
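
One quick way to check whether the weights are actually on the GPU, rather than just whether CUDA is available (learn being the learner from the notebook):

import torch

print(torch.cuda.is_available())              # True only means PyTorch can see a GPU
print(next(learn.model.parameters()).device)  # prints cuda:0 if the model really lives on the GPU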

How do you do that?

If you are running Ubuntu, you can type nvidia-smi in your terminal after you run your code to see details of your GPU usage.

Also, to force fastai to use your GPU, you can call .cuda() at the end of your code.
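
Concretely, that means calling it on the model rather than tacking it onto the training call. With a fastai v1 learner it would look roughly like this (defaults.device is the fastai v1 setting and learn is the learner from the notebook, so treat this as a sketch):

import torch
from fastai.vision import *   # fastai v1; the star import exposes `defaults`

defaults.device = torch.device('cuda')  # have fastai place models and batches on the GPU
learn.model.cuda()                      # or move an existing learner's model explicitly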

I faced a similar issue when running fastai on my PC running Ubuntu 18.04. In my case, the GPU was not being used at all and the training time was about 27 minutes total for lesson 1 with resnet34. I have a GTX 1080.

Here are the steps I followed to make it use my GPU:
1. Make sure the driver for your GPU is installed. Check this by going into Software & Updates and clicking on the Additional Drivers tab, then select the latest version of the NVIDIA drivers.
Then reboot your computer; it will ask you to go through the Secure Boot process, so follow it, reboot, and check that the driver is installed.
2. Now run this in your Jupyter notebook:

import torch
print(torch.cuda.device(0))           # a device object for GPU 0
print(torch.cuda.get_device_name(0))  # the name of GPU 0

The result should be something like

<torch.cuda.device object at 0x7fda2c6b7518>
GeForce GTX 1080

You might get an error here if the driver is not set up correctly.
3. PyTorch now knows about your GPU, so train your model. You can check your GPU utilization by running

watch nvidia-smi
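
If you would rather check from inside the notebook than keep a terminal open, PyTorch can report GPU memory use directly; a small sketch:

import torch

# A rough in-notebook equivalent of glancing at nvidia-smi's memory column
print(torch.cuda.memory_allocated(0) / 1024**2, 'MiB currently allocated')
print(torch.cuda.max_memory_allocated(0) / 1024**2, 'MiB peak so far')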