Training time on AWS p2 instance

While training on the dogs-cats redux competition data on a p2.xlarge instance, I noticed a strange runtime issue.
I suspect there is a problem with the setup, because importing Keras prints this:
from keras import backend
Using Theano backend.
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available (error: Unable to get the number of gpus available: no CUDA-capable device is detected)

Currently 1 epoch takes 7088s on the p2 instance, whereas the same run on a CPU-only system with 64 GB of RAM finished in 5476s.

Search this forum for that error; quite a few people have had it. Generally a reboot fixes it.
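
Before starting a long run, it can help to confirm that the driver actually sees the GPU. Below is a minimal sketch, assuming the NVIDIA driver's `nvidia-smi` tool is installed on the instance; the helper name `gpu_visible` is made up for illustration and is not part of Keras or Theano.

```python
import subprocess


def gpu_visible():
    """Return True if `nvidia-smi -L` runs and lists at least one GPU.

    Hypothetical helper: it only checks what the driver reports, not
    whether Theano itself is configured to use the GPU.
    """
    try:
        # `nvidia-smi -L` prints one line per device, e.g. "GPU 0: Tesla K80 (...)"
        out = subprocess.check_output(["nvidia-smi", "-L"], stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        # Tool missing, or the driver could not talk to any device
        return False
    return b"GPU " in out


if __name__ == "__main__":
    print("GPU visible: %s" % gpu_visible())
```

If this prints False on a p2 instance, the training run would likely fall back to the CPU, so a stop and restart is worth trying before training.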

I stopped the instance and restarted it.
It now shows the following:
from keras import backend
Using Theano backend.
Using gpu device 0: Tesla K80 (CNMeM is disabled, cuDNN 5103)
/home/ubuntu/anaconda2/lib/python2.7/site-packages/theano/sandbox/cuda/ UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.

Glad to say it now takes around 628s for 1 epoch!
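
To put the three timings reported in this thread side by side, here is a small sketch of the speedup arithmetic. The function and variable names are only illustrative; the numbers are the per-epoch times quoted above.

```python
def speedup(before_s, after_s):
    """Ratio of the old epoch time to the new one."""
    return float(before_s) / after_s


gpu_missing_s = 7088   # p2 instance while the GPU was not detected
cpu_only_s = 5476      # CPU-only system with 64 GB of RAM
gpu_working_s = 628    # p2 instance after the restart

print("vs. p2 without GPU: %.1fx" % speedup(gpu_missing_s, gpu_working_s))  # 11.3x
print("vs. CPU-only system: %.1fx" % speedup(cpu_only_s, gpu_working_s))    # 8.7x
```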

