How to check whether your PyTorch / Keras is using the GPU?

As we work on setting up our environments, I found this quite useful:

To check that torch is using a GPU:

In [1]: import torch

In [2]: torch.cuda.current_device()
Out[2]: 0

In [3]: torch.cuda.device(0)
Out[3]: <torch.cuda.device at 0x7f2132913c50>

In [4]: torch.cuda.device_count()
Out[4]: 1

In [5]: torch.cuda.get_device_name(0)
Out[5]: 'Tesla K80'
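
The quickest single check (continuing the session above) is torch.cuda.is_available(), which returns True only when a usable CUDA device is found:

In [6]: torch.cuda.is_available()
Out[6]: True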

To check that keras is using a GPU:

import tensorflow as tf
tf.Session(config=tf.ConfigProto(log_device_placement=True))

and check the jupyter logs for device info.
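
Another option with the TF1-era API is to list the devices TensorFlow can see; a GPU should show up with device_type 'GPU' (a sketch, assuming TensorFlow 1.x):

from tensorflow.python.client import device_lib

# List every device TensorFlow can see; a GPU appears with
# device_type 'GPU' in the output.
print(device_lib.list_local_devices())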

One could argue that ‘seeing’ a GPU does not really tell us that it is being used in training, but I think that here this is practically equivalent. Once a library sees the GPU, we are all set.
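
If you want to go one step further and actually exercise the device, a minimal sketch (assuming torch 0.4+ style device arguments):

import torch

# Allocate directly on the GPU and run a small matmul;
# if this completes, the device is genuinely usable.
x = torch.randn(1000, 1000, device='cuda')
y = torch.mm(x, x)
print(y.device)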

13 Likes

Is there a way to pull up utilization for the GPUs? I tried looking but must have missed it.

I imagine at some point a future function that goes “aha, the GPU is only 40% utilized, I can increase my batch size by X”

1 Like

That is an interesting question and would be worth researching. I do not have an answer myself.

I use nvidia-smi -l to see what my GPU is up to, but this only gives you basic information. I think that newer versions of keras started preallocating all the GPU memory, so it doesn’t tell you much there, but with torch you can see how much GPU memory is utilized by your process, which can be quite handy.

There are also other considerations that go into choosing the batch size, so a heuristic that always packs as many training examples into a batch as will fit might not be ideal. Still, for this purpose - a cursory glance at model size on the GPU given the batch size and other parameters - nvidia-smi seems to work quite nicely for torch, at least based on my 2 hrs or so of experience with it thus far :wink:
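
On the torch side, the memory numbers can also be queried from Python rather than from nvidia-smi (a sketch; these counters assume a reasonably recent torch build that exposes the caching allocator stats):

import torch

# Bytes currently held by tensors vs. bytes reserved by torch's
# caching allocator on device 0.
print(torch.cuda.memory_allocated(0))
print(torch.cuda.memory_cached(0))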

2 Likes

nvidia-smi

Shows GPU memory used (you can increase batch size if not fully utilized), GPU utilisation and the processes using the GPU.

nvidia-settings -q GPUCurrentClockFreqs

Shows the frequency at which the GPU is operating. It should be at peak spec speed when training (it steps down if the GPU heats up beyond the design threshold).

Mine was running at 1.7 GHz.

3 Likes

Yeah, basically I want my nvidia-smi in Jupyter, with some values stored for future reference.

I remember seeing other posts where utilization was recorded into a nice graph with Keras. But I haven’t had a chance to find that yet.
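
In the meantime, one way to get nvidia-smi-style numbers inside a notebook cell is via the NVML bindings (a sketch, assuming the pynvml package is installed):

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU

# Instantaneous compute utilization (percent) and memory in use.
util = pynvml.nvmlDeviceGetUtilizationRates(handle)
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
print('util: %d%%  mem: %d MiB' % (util.gpu, mem.used / 1024 ** 2))

pynvml.nvmlShutdown()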

1 Like

You could use watch -n 1 nvidia-smi so that the usage statistics are refreshed every 1 second.

1 Like

Yes, that was a refreshing change with PyTorch. Keras (and I think it trickles down from the TensorFlow backend) would grab the entire GPU memory by default, and you would not know what the actual usage is.

2 Likes

Also, TensorFlow’s default setting is to allocate the full GPU memory during the run. Hence, with Keras and TF as the backend, you’re most likely to see ~100% of memory being allocated. This default behavior can be changed with @jeremy’s tip here (Tip: Clear tensorflow GPU memory).

PyTorch is nicer in this respect, allocating just as much memory as it needs.
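
For reference, the standard TF1-era knob behind that tip is gpu_options.allow_growth, which makes TensorFlow grow its allocation on demand instead of grabbing the whole card up front (a sketch; the linked tip may differ in detail):

import tensorflow as tf

# Allocate GPU memory incrementally instead of reserving it all
# at session creation.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)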

Edit: @anandsaha just beat me to my comment !

2 Likes

nvidia-smi dmon is much more helpful for seeing how well your GPU is utilized. Look at the ‘sm’ column there.

16 Likes

Let’s try to stick to discussing the modules we’re using in this course - i.e. Pytorch - since otherwise it’ll get pretty confusing!

3 Likes

Can someone let me know their module and driver versions…
I have downloaded them three times, but the CUDA installer always reports them as incompatible.

I am not sure what you are looking for, but if you want to install the CUDA driver, start here: https://developer.nvidia.com/cuda-downloads

Choose your OS and architecture (x86_64, most likely) and you will get the file to download.

For the PyTorch installation, go to pytorch.org. Most likely you will want the CUDA 8.0 build.

Finally, for fast.ai, you will git clone https://github.com/fastai/fastai.git. But since it will be updated with new material before each class, you will need to do a git pull each time.

These are very general pointers. Hope they are useful.
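
If it helps with debugging version mismatches, the installed versions can be printed from Python (a sketch; torch.version.cuda reports the CUDA version the wheel was built against):

import torch

print(torch.__version__)               # PyTorch version
print(torch.version.cuda)              # CUDA version torch was built with
print(torch.backends.cudnn.version())  # cuDNN version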

1 Like

The problem is fixed now…

Stumbled upon that post, it seems. Here is the notebook that does it (plots GPU utilization in the notebook itself).

5 Likes

Awesome thanks for finding that!

I like using this:

watch -n 0.5 nvidia-smi

I run this in a tmux pane and it automatically refreshes the output of nvidia-smi.

4 Likes

watch is great. You can also do:
watch -n 1 free -m to track regular memory usage while running a model…

4 Likes

Cool. Put it in another tmux pane. Beautiful!

I am using this to glance at the CPU/GPU utilization from the desktop.

If I have 3 GPUs, how do I tell PyTorch to use GPU 2?
Thanks!
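
A minimal sketch using PyTorch's standard device-selection APIs (device indices are zero-based, so cuda:2 below is the third card):

import torch

# Make device 2 the default for subsequent .cuda() calls.
torch.cuda.set_device(2)
x = torch.randn(10).cuda()            # lands on device 2

# Or pin a tensor explicitly (torch 0.4+ style):
y = torch.randn(10, device='cuda:2')

# Alternatively, set CUDA_VISIBLE_DEVICES=2 in the environment before
# launching, so only that card is visible to the process.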