Lesson 1, Setup, Cudnn too new, keras expects 5 and cannot find 6

dllearner · May 28, 2017, 1:51am

I am running on a local machine in a conda virtual environment. I have CUDA 8 installed and cuDNN 6 for CUDA 8. I don’t want to downgrade if I don’t have to. I understand that the fast.ai lessons rely on Python 2, so I have that configured in the conda env that I launch Jupyter notebook from.

I tried with TensorFlow installed as the backend, which had the same base error, that it couldn’t find cuDNN 5, which makes sense because I have version 6. I then tried to run Theano in case it didn’t need cuDNN, but now not only does it still not find the right cuDNN, it doesn’t actually use Theano?

I created .keras/keras.json and .theanorc files as shown in the videos.

Those files are:

{
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano"
}

and

[global]
floatX = float32
device = gpu0

[nvcc]
fastmath = True

[cuda]
root=/usr/local/cuda/

The Jupyter notebook error comes from these lines:

from imp import reload
import utils; reload(utils)
from utils import plots

and is:

WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10).  Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
 https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29

Using gpu device 0: Quadro M4000 (CNMeM is disabled, cuDNN 6021)
/home/user/.conda/envs/py2/lib/python2.7/site-packages/theano/sandbox/cuda/__init__.py:631: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.1.
  warnings.warn(warn)
Using TensorFlow backend.

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-4-383734b7678b> in <module>()
      1 from imp import reload
----> 2 import utils; reload(utils)
      3 from utils import plots

/home/user/Documents/fastai/courses/deeplearning1/nbs/utils.py in <module>()
     31 from theano.tensor.signal import pool
     32 
---> 33 import keras
     34 from keras import backend as K
     35 from keras.utils.data_utils import get_file

/home/user/.conda/envs/py2/lib/python2.7/site-packages/keras/__init__.py in <module>()
      1 from __future__ import absolute_import
      2 
----> 3 from . import activations
      4 from . import applications
      5 from . import backend

/home/user/.conda/envs/py2/lib/python2.7/site-packages/keras/activations.py in <module>()
      2 import six
      3 import warnings
----> 4 from . import backend as K
      5 from .utils.generic_utils import deserialize_keras_object
      6 from .engine import Layer

/home/user/.conda/envs/py2/lib/python2.7/site-packages/keras/backend/__init__.py in <module>()
     71 elif _BACKEND == 'tensorflow':
     72     sys.stderr.write('Using TensorFlow backend.\n')
---> 73     from .tensorflow_backend import *
     74 else:
     75     raise ValueError('Unknown backend: ' + str(_BACKEND))

/home/user/.conda/envs/py2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py in <module>()
----> 1 import tensorflow as tf
      2 from tensorflow.python.training import moving_averages
      3 from tensorflow.python.ops import tensor_array_ops
      4 from tensorflow.python.ops import control_flow_ops
      5 from tensorflow.python.ops import functional_ops

/home/user/.conda/envs/py2/lib/python2.7/site-packages/tensorflow/__init__.py in <module>()
     22 
     23 # pylint: disable=wildcard-import
---> 24 from tensorflow.python import *
     25 # pylint: enable=wildcard-import
     26 

/home/user/.conda/envs/py2/lib/python2.7/site-packages/tensorflow/python/__init__.py in <module>()
     49 import numpy as np
     50 
---> 51 from tensorflow.python import pywrap_tensorflow
     52 
     53 # Protocol buffers

/home/user/.conda/envs/py2/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
     50 for some common reasons and solutions.  Include the entire stack trace
     51 above this error message when asking for help.""" % traceback.format_exc()
---> 52   raise ImportError(msg)
     53 
     54 # pylint: enable=wildcard-import,g-import-not-at-top,unused-import,line-too-long

ImportError: Traceback (most recent call last):
  File "/home/user/.conda/envs/py2/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/user/.conda/envs/py2/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/user/.conda/envs/py2/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcudnn.so.5: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.
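The last line of the traceback says the dynamic loader could not open `libcudnn.so.5`. A minimal sketch for checking which cuDNN sonames the loader can actually open from inside the same Python environment is below. The helper name `loadable_libraries` is hypothetical, and the soname `libcudnn.so.6` is an assumption based on the cuDNN 6 install described above.

```python
import ctypes


def loadable_libraries(names):
    """Return the subset of shared-library names the dynamic loader can open."""
    found = []
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the loader cannot find or open it
        except OSError:
            continue
        found.append(name)
    return found


# Sonames are assumptions: .5 is taken from the error above, .6 from the installed cuDNN 6.
candidates = ["libcudnn.so.5", "libcudnn.so.6"]
print(loadable_libraries(candidates))
```

If only the `.6` name loads, the installed TensorFlow build was presumably linked against cuDNN 5 and will keep failing until a matching library is on the loader path.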

I also tried making a sym link:

Try 2:
If, however, I try to use Python 3 instead, which has an up-to-date TensorFlow etc. and shouldn’t ask for cuDNN 5, I get the cPickle error, since cPickle is part of Python 2 only (or so forum conversations have led me to believe).
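The thread does not show the exact cPickle error, but if it is an ImportError on `import cPickle`, a common compatibility idiom is to fall back to the standard `pickle` module, which covers the same functionality in Python 3. This is a sketch of that idiom only; other Python 2 specific code in the course files may still need changes.

```python
# Python 2 ships cPickle as a separate module; in Python 3 it was folded into pickle.
try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3

# Round-trip a small object to confirm the module works under either interpreter.
data = {"lesson": 1}
restored = pickle.loads(pickle.dumps(data))
print(restored == data)
```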

Try 3:
I have even tried using a new conda env that ONLY has Theano installed (no TensorFlow), and even setting “KERAS_BACKEND=theano” from bash prior to starting, but it still tries Theano, partially crashes, says it is using the TensorFlow backend, then crashes saying there is no TensorFlow.
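Keras chooses its backend at import time: it starts from a default of TensorFlow, then reads the "backend" field of keras.json, and finally lets the KERAS_BACKEND environment variable override both. The sketch below mimics that lookup so it can be run inside the notebook kernel to see what Keras would pick there. The function name `resolve_keras_backend` is hypothetical, and the `~/.keras/keras.json` location is the usual default, assumed here rather than confirmed for this install.

```python
import json
import os


def resolve_keras_backend(config_path, environ, default="tensorflow"):
    """Mimic the lookup order: default, then keras.json, then KERAS_BACKEND."""
    backend = default
    if os.path.exists(config_path):
        with open(config_path) as f:
            backend = json.load(f).get("backend", backend)
    # The environment variable, when set, takes precedence over the file.
    return environ.get("KERAS_BACKEND", backend)


# Assumed default location of the Keras config file.
config_path = os.path.join(os.path.expanduser("~"), ".keras", "keras.json")
print(resolve_keras_backend(config_path, os.environ))
```

If this prints tensorflow inside Jupyter while the shell has KERAS_BACKEND=theano, the kernel is probably not inheriting the shell's environment or is reading a different home directory.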

I just want to join this cool deep learning club and just can’t get started…

manu · July 28, 2017, 8:51pm

It’s been some time… any luck there? I’m in the same boat.

manu · July 29, 2017, 6:49am

I got it working. I had to downgrade cuDNN to version 5.1, which is still available on the Nvidia website after you sign up.
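To confirm which cuDNN version is installed before or after a downgrade like this, one option is to read the version macros from cudnn.h. cuDNN releases of this era define `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` in that header. The sketch below is a rough check, not an official tool: the function name `parse_cudnn_version` is hypothetical, and the header path is an assumption derived from `root=/usr/local/cuda/` in the .theanorc from the first post.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None if absent."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+%s\s+(\d+)" % key, header_text, re.M)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Assumed header location; adjust to wherever cuDNN was unpacked.
header_path = "/usr/local/cuda/include/cudnn.h"
try:
    with open(header_path) as f:
        print(parse_cudnn_version(f.read()))
except IOError:
    print("cudnn.h not found at %s" % header_path)
```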

andrii · August 14, 2017, 6:00pm

Hi guys!

I’m a newbie and just started the course. I’ve executed “install-gpu.sh” and run Jupyter Notebook, but I also got a “WARNING (theano.sandbox.cuda) …”. At the moment I don’t understand what to do next. I can’t figure out how to downgrade cuDNN (if that is needed) or what to do in general.

Can you please help?

Ptilulu · August 15, 2017, 12:22pm

Hi Manu! Could you describe what you did to downgrade cudnn to version 5.1? Thanks!

manu · August 15, 2017, 12:49pm

Hi,

I’m using Gentoo Linux (not sure if you are familiar with it). I simply renamed the official ebuild for cudnn-6.0 to “cudnn-5.1.ebuild”, put it in my local “portage” and, separately, downloaded “cudnn-8.0-linux-x64-v5.1.tgz” from the Nvidia website. Probably not very useful if you are using another OS or Linux distribution, but anyway, for what it’s worth.

Cheers.

carlosdeep · September 15, 2017, 2:50pm

I had a similar issue but got it fixed. I just updated line 62 in dnn.py from ‘cudnn64_5.dll’ to ‘cudnn64_6.dll’, which is the DLL name for cuDNN 6.0.
Cheers.

harryh · November 23, 2017, 9:18am

With the latest Theano version, 1.0.0, I was able to work with cuDNN 7.0.4 + CUDA 9.0.176.

conda install -c mila-udem theano pygpu

And the combination also worked with Tensorflow 1.4.0.

manu · December 4, 2017, 8:09pm

Harry,

could you please post your setup, e.g.,

conda list

and also your .theanorc. Do you also need to run jupyter setting

MKL_THREADING_LAYER=GNU

?

I’m getting the error

Can not use cuDNN on context None: cannot compile with cuDNN. We got this error
/tmp/try_flags_Mma0Dp.c:4:19: fatal error: cudnn.h: No such file or directory
#include <cudnn.h>

It’s probably a mess with the versions but who knows… I’m using cudnn-9.0-linux-x64-v7.tgz and cuda-9.0.176
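The compile error above means the compiler did not find cudnn.h on its include path when Theano tried to build its cuDNN test program. A quick way to see where (or whether) the header exists is sketched below. The helper name `find_header` is hypothetical and the candidate directories are guesses, not paths taken from this setup.

```python
import os


def find_header(directories, name="cudnn.h"):
    """Return the directories from the list that contain the named header file."""
    return [d for d in directories if os.path.isfile(os.path.join(d, name))]


# Candidate directories are guesses; add the place where the cuDNN tarball was unpacked.
candidates = ["/usr/local/cuda/include", "/opt/cuda/include", "/usr/include"]
print(find_header(candidates))
```

Once the directory is known, Theano's `dnn.include_path` and `dnn.library_path` flags can point at the header and the library respectively.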

Cheers.

PS: it used to work for me with the old versions, but Gentoo has just dropped gcc5, and that broke my setup.

harryh · December 5, 2017, 12:34am

My bad. I should have mentioned that I am on Windows 7. Also, Theano 1.0 works fine with the GPU on my PC, but I am still waiting for TensorFlow 1.4 (or later) to go with Windows + GPU.

tensorflow-gpu            1.4.0                     <pip>
tensorflow-tensorboard    0.4.0rc3                  <pip>
testpath                  0.3                      py36_0
theano                    1.0.0            py36hd53c938_0    mila-udem
Theano                    0.9.0                     <pip>

MKL_THREADING_LAYER=GNU
THEANO_FLAGS=floatX=float32,device=cuda0,optimizer_including=cudnn,gpuarray.preallocate=0,dnn.conv.algo_bwd_filter=deterministic,dnn.conv.algo_bwd_data=deterministic,dnn.include_path=d:/toolkits.win/cuda-9.0.176/include,dnn.library_path=d:/toolkits.win/cuda-9.0.176/lib/x64
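Theano reads THEANO_FLAGS when it is first imported, so inside a notebook the variables have to be set before `import theano` runs. The sketch below builds the comma-separated string from a list of pairs and exports it from Python. The helper name `build_theano_flags` is hypothetical, and only a subset of the flags listed above is used, since the `dnn.include_path` and `dnn.library_path` values are specific to that machine.

```python
import os


def build_theano_flags(settings):
    """Join (key, value) pairs into the comma-separated THEANO_FLAGS format."""
    return ",".join("%s=%s" % (key, value) for key, value in settings)


# A subset of the flags from the post above; add dnn.include_path and
# dnn.library_path with the paths that match your own CUDA/cuDNN install.
settings = [
    ("floatX", "float32"),
    ("device", "cuda0"),
    ("optimizer_including", "cudnn"),
    ("gpuarray.preallocate", "0"),
]
flags = build_theano_flags(settings)
os.environ["MKL_THREADING_LAYER"] = "GNU"
os.environ["THEANO_FLAGS"] = flags  # must be set before `import theano`
print(flags)
```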

manu · December 6, 2017, 8:50am

Got it working. Thank you so much!!

PS: are you using Python 3 (py36) for the first part of the course? I thought it was only meant to work with Python 2 / Keras 1.2.

harryh · December 6, 2017, 9:21am

Thanks to https://github.com/philferriere/dlwin and https://github.com/roebius/deeplearning_keras2, I have just been able to finish the Lesson 1 to 7 notebooks with Python 3.6 and Keras 2.

manu · December 6, 2017, 10:22am

Good to know about that one

Thanks!!

skbisoi · March 21, 2018, 1:44am

Hi Harry,
I am using an Azure machine and connecting to that Azure Deep Learning VM through the Git Bash shell.

While running the Jupyter notebook I am getting an issue like

So to avoid this I downgraded CUDA to version 8.0 and did not get that error,

but I am getting this warning…

skbisoi · March 21, 2018, 1:46am

Hello, the current setup says to use Python 2.7… are you using Python 3?
If you are using Python 3, then please tell me how to upgrade all the dependencies to Python 3. I am currently on Python 2.7 and am afraid it will mess everything up. Please give some clear info about the Python 3 environment for this course.

harryh · March 21, 2018, 2:50am

This is how I installed my theano package.

conda install -c mila-udem/label/pre theano

And this is why I simply ignored the h5py FutureWarning message.

https://stackoverflow.com/a/48774337