GPU utilization often drops to zero

I’m trying out the new v1 library on this places dataset.

The images are not very big (they're not all the same size, but all are smaller than 1000x1000). My GPU utilization keeps fluctuating between 0% and 99%. I assume this is because the CPU is the bottleneck. I have tried using fewer image transforms via

get_transforms(do_flip=False, max_lighting=None, max_rotate=None, max_warp=None)

but still see the fluctuating GPU utilization. If it’s not the transforms causing the bottleneck, it might be the resize operations.

IIRC, in fastai v0.7 there was an option to resize the images once beforehand and use those resized copies instead of resizing on the fly. I think this might be a solution, but I couldn't find an easy way to do it in the v1 library. Is there a way to do this?

You can always resize the images beforehand:

from PIL import Image

ns = 200  # desired new size
img = Image.open(filename).resize((ns, ns))
img.save(filename)  # or write to a separate folder to keep the originals

Do this once for each image, then train on the resized copies instead.


What platform are you on? It may be you need an optimized jpeg lib.

I’m on GCP’s Deep Learning VM with an Intel CPU (8 cores) and an NVIDIA GPU.

I tried to drop in Pillow-SIMD instead of Pillow but it broke something, so I reverted to standard Pillow.

I will try and see if I can replicate behaviour with other datasets.

@viraat How did you solve this? I am facing a similar problem. My GPU utilization switches between 0% and 50%, but my CPU utilization is only ~10–15%, so I'm not sure where the problem is. Is it in the data loader or some other internal image augmentation?


You could try increasing the num_workers to 16 when you’re creating a DataBunch object. I would also try increasing the batch_size.
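For example, something along these lines when building the DataBunch (the dataset path, image size, and batch size below are placeholders, not from this thread):

from fastai.vision import ImageDataBunch, get_transforms

data = ImageDataBunch.from_folder(
    'data/places',             # hypothetical dataset path
    ds_tfms=get_transforms(),  # or your reduced set of transforms
    size=224,                  # on-the-fly resize target
    bs=64,                     # larger batch size, if GPU memory allows
    num_workers=16,            # more workers fetching and augmenting data
)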

Let me know if that helps.


My system has 8 CPU cores. Is there any reason you think increasing it to 16 will be better?

A general rule of thumb I follow is to use 2x the number of CPU cores. If you have more workers fetching data, it should help.

If your images are large (> 500x500), I would consider resizing them to something reasonable, storing the resized copies on disk, and using those instead.
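A minimal sketch of that pre-resizing step, assuming JPEGs in a flat folder (the paths and target size are placeholders):

from pathlib import Path
from PIL import Image

src = Path('data/places')          # hypothetical folder with the originals
dst = Path('data/places_resized')  # hypothetical output folder
dst.mkdir(parents=True, exist_ok=True)
ns = 500                           # target size

for f in src.glob('*.jpg'):
    Image.open(f).resize((ns, ns)).save(dst / f.name)

Then point the DataBunch at the resized folder instead of the original one.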


Thanks Viraat! This did help. Another observation: my volatile GPU utilization fluctuates a lot between 0% and 80%, so I believe the data loader is the bottleneck. I guess optimizing data loading to the GPU (by storing the resized images and maybe not augmenting) may lead to better GPU utilization.


What command did you use here?

You can type nvidia-smi on the command line to see GPU stats like temperature, memory utilization, power consumption, and more. To get a view that refreshes every 2 seconds, you can type watch nvidia-smi.

Ah, great! Yeah, I already knew about nvidia-smi, but not watch nvidia-smi.

Thanks a lot for your prompt reply :).

You’re welcome :) You can use watch with other commands too; it’s a general command-line utility.

Ahh, I didn’t know that :). Thanks again.