No CUDA-capable device is detected

My notebook shows this message under the 4th cell. The AWS instance is a p2.xlarge configured with the provided setup script.
Am I missing some configuration?

Using Theano backend.
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available (error: Unable to get the number of gpus available: no CUDA-capable device is detected)


What is the output of the command

nvidia-smi

on the command line on your instance?

(I don’t know much about GPUs, but that command should give you some information about the recognized NVIDIA card, and might give you a helpful error message if there’s something wrong.)
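You could also check whether the instance sees the GPU hardware at all (just a quick sanity check; on a p2 instance lspci should list an NVIDIA Tesla K80):

lspci | grep -i nvidia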

Yes, there does seem to be something wrong; the output is below:
ubuntu@ip-10-0-0-10:~/as/repos/DL/nbs/data/redux$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch

This is fixed now, running on the GPU!

Glad it’s working now! I got that error last night, and then it worked fine in the morning for me (I stopped and restarted my instance in the meantime).


How did you fix it? Did it somehow fix itself?

@jeff I had the same issue; it worked after restarting the instance and checking with the nvidia-smi command.


Same issue - and a quick reboot fixed it.

Generally speaking you should find that ‘sudo modprobe nvidia’ fixes most problems that would otherwise need a reboot. Just a little shortcut - nothing wrong with rebooting, of course.
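For example, something like this rough sequence (assuming the driver packages themselves are intact):

lsmod | grep nvidia      # is the nvidia kernel module currently loaded?
sudo modprobe nvidia     # load it if it isn’t
nvidia-smi               # should now list the GPU without the NVML error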


I experienced the same issue after “sudo apt-get update && sudo apt-get upgrade”; a reboot fixed it for me, too.
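Incidentally, the mismatch seems to happen when an upgrade installs new NVIDIA userspace libraries while the old kernel module is still loaded. You can compare the two versions with something like:

cat /proc/driver/nvidia/version    # version of the loaded kernel module
nvidia-smi                         # talks to the userspace library; fails if the versions differ

A reboot (or unloading and reloading the module) brings them back in sync.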


I had to follow what Jeff did to resolve the issue.

Interesting - it seems like some apt installs can disable the nvidia module. Weird. I had this problem and Jeremy’s suggestion nailed it.

Just to add - I created a new p2 instance (Ireland) and nvidia-smi failed with “Failed to initialize NVML: Driver/library version mismatch”. However, the suggested simple fix, sudo modprobe nvidia, did not help. Rebooting the instance fixed the problem.
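In hindsight, I suspect modprobe alone doesn’t help when a stale module is still loaded; something like the following might work without a reboot, though I haven’t verified it:

sudo rmmod nvidia_uvm nvidia    # unload the stale modules (fails if a process is still using the GPU)
sudo modprobe nvidia            # load the freshly installed driver version
nvidia-smi                      # check that the GPU is listed again

A reboot remains the safe option.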

As this was rather confusing, and given that the ~tutorial~ setup video otherwise just “worked perfectly”, I hope this thread helps other newcomers get around this issue.


The output of nvidia-smi is “command not found” on my p2 instance.

sudo modprobe nvidia didn’t fix my issue either.

If this is the first time you’ve run the setup scripts, the instructions do say you should reboot the instance.

So try ‘sudo reboot’.

‘nvidia-smi’ worked fine for me afterwards.
