No CUDA-capable device is detected

My notebook shows this message under the 4th cell. The AWS instance is a p2.xlarge configured with the provided setup script.
Am I missing some configuration?

Using Theano backend.
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available (error: Unable to get the number of gpus available: no CUDA-capable device is detected)


What is the output of the command

nvidia-smi

on the command line on your instance?

(I don’t know much about GPUs, but that command should give you some information about the recognized NVIDIA card, and might give you a helpful error message if there’s something wrong.)
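You could also check whether the instance sees the GPU hardware at all (just a quick sanity check; on a p2 instance lspci should list an NVIDIA Tesla K80):

lspci | grep -i nvidia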

Yes, there does seem to be something wrong; the output is below:
ubuntu@ip-10-0-0-10:~/as/repos/DL/nbs/data/redux$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch

This is fixed now, running on the GPU!

Glad it’s working now! I got that error last night, and then it worked fine in the morning for me (I stopped and restarted my instance in the meantime).


How did you fix it? Did it somehow fix itself?

@jeff I had the same issue; it worked after restarting the instance and checking with the nvidia-smi command.


Same issue - and a quick reboot fixed it.

Generally speaking you should find that ‘sudo modprobe nvidia’ fixes most problems that would otherwise need a reboot. Just a little shortcut - nothing wrong with rebooting, of course.
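For example, something like this rough sequence (assuming the driver packages themselves are intact):

lsmod | grep nvidia      # is the nvidia kernel module currently loaded?
sudo modprobe nvidia     # load it if it isn’t
nvidia-smi               # should now list the GPU without the NVML error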


I experienced the same issue after “sudo apt-get update && sudo apt-get upgrade”; a reboot fixed it for me, too.
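Incidentally, the mismatch seems to happen when an upgrade installs new NVIDIA userspace libraries while the old kernel module is still loaded. You can compare the two versions with something like:

cat /proc/driver/nvidia/version    # version of the loaded kernel module
nvidia-smi                         # talks to the userspace library; fails if the versions differ

A reboot (or unloading and reloading the module) brings them back in sync.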


I had to follow what Jeff did to resolve the issue.

Interesting - it seems like some apt installs can disable the nvidia module. Weird. I had this problem and Jeremy’s suggestion nailed it.

Just to add - I created a new p2 instance (Ireland) and nvidia-smi failed with “Failed to initialize NVML: Driver/library version mismatch”. However, the suggested simple fix, sudo modprobe nvidia, did not help. Rebooting the instance fixed the problem.
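In hindsight, I suspect modprobe alone doesn’t help when a stale module is still loaded; something like the following might work without a reboot, though I haven’t verified it:

sudo rmmod nvidia_uvm nvidia    # unload the stale modules (fails if a process is still using the GPU)
sudo modprobe nvidia            # load the freshly installed driver version
nvidia-smi                      # check that the GPU is listed again

A reboot remains the safe option.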

As this was rather confusing, and given that the ~tutorial~ setup video otherwise just “worked perfectly”, I hope this thread helps other newcomers get around this issue.


The output of nvidia-smi is “command not found” on my p2 instance.

sudo modprobe nvidia didn’t fix my issue either.

If this is the first time you’ve run the setup scripts, the instructions do say you should reboot the instance.

So try ‘sudo reboot’.

‘nvidia-smi’ worked fine for me afterwards.
