Py3 and tensorflow setup

RogerS49 · May 2, 2017, 7:42am

For a GTX 1080 Ti with cuDNN 6.0.20, CUDA 8.0.61, and driver 378.13

Simply added Anaconda 3 and Tensorflow

DID NOT sudo update, upgrade any drivers.
DID NOT change cudnn-6.0.20

Faced with this error

Import error: libcudnn.so.5 : cannot open shared object file: no such file or directory

see this

http://stackoverflow.com/questions/42013316/after-building-tensorflow-from-source-seeing-libcudart-so-and-libcudnn-errors

and this if you have 2 GPUs and the wrong one is seen

http://stackoverflow.com/questions/37893755/tensorflow-set-cuda-visible-devices-within-jupyter

and this

http://stackoverflow.com/questions/41965187/nvidia-device-error-in-tensorflow/41975926#41975926
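For the two-GPU case, here is a minimal sketch of the usual approach: set CUDA_VISIBLE_DEVICES before TensorFlow is imported in the notebook. The helper name select_gpu and the device index 0 are only illustrative assumptions; check nvidia-smi for the index of the card you actually want.

```python
import os

def select_gpu(index):
    """Expose only one GPU to CUDA. Call this before importing tensorflow."""
    # Order devices by PCI bus ID so the index lines up with nvidia-smi.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]

select_gpu(0)  # 0 is an assumed index, not taken from this thread
# import tensorflow as tf  # import only after the variables are set
```

If TensorFlow was already imported in the running kernel, restart the kernel first; changing the variable afterwards generally has no effect.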

So far I have not tried it beyond running the Keras-Tensorflow_Tutorial.ipynb. At this time I have issues with printing the accuracy from the model (added as per the live Tensorflow Mnist tutorial), a learning issue, but the main aim here was simply to get from Theano to Tensorflow for anyone with a home server and a 1080 Ti.

darthdeus · May 2, 2017, 11:37pm

Just a small tip: you probably don’t want the -r with rm if you are deleting a file. The -f is still useful though, as it will not error out if the file doesn’t exist.

mribbons · September 20, 2017, 4:31am

Nice.

You can also use source activate python3 to switch to python3.

Also, you can call source ~/.bashrc instead of logging back in.

Patrick · September 23, 2017, 12:19pm

Hi,

Has anyone followed the instructions above recently? I installed tensorflow version 1.3 as this is the default version on pypi. However, it appears as if this version of tensorflow requires cuDNN version 6. I base this assumption on the ImportError shown below. The link in the first post of this thread refers to cuDNN 5.1, so I imagine there is a mismatch between what tensorflow 1.3 requires and what I installed.

Here is the error:

In [1]: import tensorflow
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
 40     sys.setdlopenflags(_default_dlopen_flags | ctypes.RTLD_GLOBAL)
---> 41   from tensorflow.python.pywrap_tensorflow_internal import *
 42   from tensorflow.python.pywrap_tensorflow_internal import __version__

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in <module>()
 27             return _mod
---> 28     _pywrap_tensorflow_internal = swig_import_helper()
 29     del swig_import_helper

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in swig_import_helper()
 23             try:
---> 24                 _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
 25             finally:

~/anaconda3/lib/python3.6/imp.py in load_module(name, file, filename, details)
241         else:
--> 242             return load_dynamic(name, filename, file)
243     elif type_ == PKG_DIRECTORY:

~/anaconda3/lib/python3.6/imp.py in load_dynamic(name, path, file)
341             name=name, loader=loader, origin=path)
--> 342         return _load(spec)
343 

ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

ImportError                               Traceback (most recent call last)
<ipython-input-1-a649b509054f> in <module>()
----> 1 import tensorflow

~/anaconda3/lib/python3.6/site-packages/tensorflow/__init__.py in <module>()
 22 
 23 # pylint: disable=wildcard-import
---> 24 from tensorflow.python import *
 25 # pylint: enable=wildcard-import
 26 

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/__init__.py in <module>()
 47 import numpy as np
 48 
---> 49 from tensorflow.python import pywrap_tensorflow
 50 
 51 # Protocol buffers

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
 50 for some common reasons and solutions.  Include the entire stack trace
 51 above this error message when asking for help.""" % traceback.format_exc()
---> 52   raise ImportError(msg)
 53 
 54 # pylint: enable=wildcard-import,g-import-not-at-top,unused-import,line-too-long

ImportError: Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/home/ubuntu/anaconda3/lib/python3.6/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/home/ubuntu/anaconda3/lib/python3.6/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.
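The library name in that last ImportError line is what reveals which cuDNN version this TensorFlow build expects. As a rough sketch (the function name missing_library and the regular expression are my own, not part of TensorFlow), the name can be pulled out of such a message like this:

```python
import re

# Matches e.g. "libcudnn.so.6: cannot open shared object file"
_MISSING = re.compile(
    r"(lib[\w.+-]+\.so(?:\.\d+)*)\s*:\s*cannot open shared object file"
)

def missing_library(message):
    """Return the shared-library name an ImportError complains about, or None."""
    match = _MISSING.search(message)
    return match.group(1) if match else None

msg = ("ImportError: libcudnn.so.6: cannot open shared object file: "
       "No such file or directory")
print(missing_library(msg))  # -> libcudnn.so.6
```

The trailing number is the major version the loader is looking for, so libcudnn.so.6 means cuDNN 6 and libcudnn.so.5 means cuDNN 5.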

I’ve installed cuDNN 6 using the following link:

https://gist.github.com/mjdietzx/0ff77af5ae60622ce6ed8c4d9b419f45

waya-dl-setup.sh

#!/bin/bash

# install CUDA Toolkit v8.0
# instructions from https://developer.nvidia.com/cuda-downloads (linux -> x86_64 -> Ubuntu -> 16.04 -> deb (network))
CUDA_REPO_PKG="cuda-repo-ubuntu1604_8.0.61-1_amd64.deb"
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/${CUDA_REPO_PKG}
sudo dpkg -i ${CUDA_REPO_PKG}
sudo apt-get update
sudo apt-get -y install cuda

This file has been truncated; see the gist linked above for the full script.

I removed the cuDNN 5 tgz and folder, along with the downloads/cuda/ folder that gets created when it is unzipped. I then moved the contents of the cuDNN 6 downloads/cuda/ folder (after installing and unzipping) into the /usr/… folders as instructed in the first post. However, after doing so, I still get the same error about not being able to find libcudnn.so.6, even though it is located in my /usr/local/cuda/lib64/ folder.

I’m stuck here. Can anyone help?

Thanks!

Patrick · September 24, 2017, 11:34am

As an update, I’ve given up trying to get tensorflow 1.3 working and have decided to install tensorflow 1.2, which is compatible with cuDNN 5.1. I did so by running the following line in my shell:

pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.1-cp36-cp36m-linux_x86_64.whl

If anyone does know how to install cuDNN 6.0 and get tensorflow configured with it, please let me know.

Thanks,
Patrick

Estiui · September 25, 2017, 8:44am

I’m having the same issue, even after correcting the backslash thing in the comments. I have checked and libcudnn.so.6 seems to be in the correct folder, so I don’t know what else to do…

dredwilliams · September 29, 2017, 12:39am

The key to getting the new tensorflow to find the cudnn6.0 libraries is to either set the LD_LIBRARY_PATH variable to point to where your libcudnn.so files are located, or (like I did) put a file in the /etc/ld.so.conf.d directory with just the full path of the directory containing the shared libraries. After putting the file in the ld.so.conf.d directory, run the command ‘ldconfig’ as root and then restart your python shell and try again.

I actually installed my NVIDIA drivers and cuda libraries from the cuda repository at NVIDIA, and that installation will set up the file in the /etc/ld.so.conf.d directory for you … but I only found this out after I had given up, blown away all my GPU stuff, and started over.

Good Luck!
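Before changing anything, it can help to see which directories on LD_LIBRARY_PATH actually contain the library. This is only a sketch: dirs_containing is a made-up helper name, and it looks at LD_LIBRARY_PATH alone, so an empty result does not rule out the library being found through the ldconfig cache.

```python
import os

def dirs_containing(soname, search_path):
    """List the directories in a path-separated search path that contain soname."""
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries produced by leading, trailing, or doubled separators.
        if directory and os.path.isfile(os.path.join(directory, soname)):
            hits.append(directory)
    return hits

current = os.environ.get("LD_LIBRARY_PATH", "")
print(dirs_containing("libcudnn.so.6", current))
```

If the list is empty and the file is in /usr/local/cuda/lib64, that directory is not on LD_LIBRARY_PATH for this process, which matches the symptom described above.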

Estiui · September 29, 2017, 11:00am

Now I have one final question. After all the config steps, I’ve found that after starting jupyter notebook, my browser can’t reach the proper address (i.e. “http://localhost:8888/tree/courses/deeplearning2”). It also cannot reach the previous address for the deeplearning1 folder with the first part of the course, although that used to work when I did Part I months ago. What am I doing wrong?

pavan_alluri · November 18, 2017, 1:59pm

Thank you! I haven’t been able to fix it, but I sure have gotten around the error by using your snippet.

iNLyze · November 20, 2017, 11:15pm

Occasionally, you may run into problems related to ~/.jupyter/jupyter_notebook_config.py. Make sure you back up any existing ~/.jupyter/jupyter_notebook_config.py first, then create a new, clean configuration file using

jupyter notebook --generate-config

In this file you can also adjust quite a few useful options, such as which browser to open, etc.
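A small sketch of the backup step, in case it is useful. The helper name backup_config and the .bak-timestamp suffix are my own choices, nothing Jupyter requires:

```python
import os
import shutil
import time

def backup_config(path):
    """Copy path to path.bak-<timestamp>. Returns the backup path, or None if path is missing."""
    path = os.path.expanduser(path)
    if not os.path.isfile(path):
        return None
    backup = "{}.bak-{}".format(path, time.strftime("%Y%m%d-%H%M%S"))
    shutil.copy2(path, backup)  # copy2 also preserves timestamps and permissions
    return backup

print(backup_config("~/.jupyter/jupyter_notebook_config.py"))
```

Run it before generating the new configuration file, so the old settings can still be compared against the new ones.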

P_Wes · May 9, 2018, 7:44pm

I am in the process of setting up a new machine and I am wondering if these instructions will work from scratch, or if they will only work to update an existing environment.

When I try to import tensorflow from ipython I get the following error:

ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
...
Failed to load the native TensorFlow runtime.

After doing a little research, I realized that libcublas is somehow related to CUDA. When I run nvidia-smi, I get:

    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 106...  Off  | 00000000:01:00.0  On |                  N/A |
    |  4%   49C    P8     8W / 120W |    260MiB /  6075MiB |      1%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |    0      1013      G   /usr/lib/xorg/Xorg                           185MiB |
    |    0      1843      G   compiz                                        72MiB |
    +-----------------------------------------------------------------------------+

However, when I run cuda, I get:

No command 'cuda' found, did you mean:
 Command 'crda' from package 'crda' (main)
cuda: command not found
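Note that cuda is not a command, so that last check says nothing either way, and the nvidia-smi output above describes the driver rather than the CUDA toolkit libraries. A more direct test is to ask the dynamic loader for the library named in the error. The sketch below uses only ctypes from the standard library; can_load is a made-up helper name, and libcudart.so.9.0 is an assumed companion library that does not appear in the error above.

```python
import ctypes

def can_load(soname):
    """Return True if the dynamic loader can find and open the shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        return False

# libcublas.so.9.0 is the name from the ImportError; libcudart.so.9.0 is an assumption.
for name in ("libcublas.so.9.0", "libcudart.so.9.0"):
    print(name, can_load(name))
```

If both report False, the CUDA 9.0 libraries are either not installed or not on the loader's search path.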

kab · July 19, 2018, 9:41pm

Hi all. I have notebooks from part 1 from way back that I would like to make work with Tensorflow, specifically the cats/dogs one using vgg16.py. I’ve tried changing the dimensions and getting rid of the Theano imports in the scripts, but I am still running into dimensionality errors. Can anybody post their modified vgg16.py script and say which weights we are supposed to use? Currently I’m pulling from http://files.fast.ai/models/vgg16.h5. Are these the correct weights?