Ubuntu install-gpu.sh setup Problem


(Robbin) #1

Hi All,

I’ve run into a lot of problems trying to get my Ubuntu server set up for lesson 1. I’ve tried completely reformatting, starting from a clean Ubuntu 16.04 install, and immediately running the install-gpu.sh script.

It appears as though all of my problems are related to running Theano. In my latest attempt, I get the following error:

Python 2.7.13 |Anaconda, Inc.| (default, Sep 30 2017, 18:12:43)
[GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import theano
WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end(gpuarray)

Using gpu device 0: GeForce GTX 1070 (CNMeM is disabled, cuDNN 5103)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dsaxx005/anaconda2/lib/python2.7/site-packages/theano/__init__.py", line 116, in <module>
    theano.sandbox.cuda.tests.test_driver.test_nvidia_driver1()
  File "/home/dsaxx005/anaconda2/lib/python2.7/site-packages/theano/sandbox/cuda/tests/test_driver.py", line 41, in test_nvidia_driver1
    raise Exception("The nvidia driver version installed with this OS "
Exception: The nvidia driver version installed with this OS does not give good results for reduction.Installing the nvidia driver available on the same download page as the cuda package will fix the problem: http://developer.nvidia.com/cuda-downloads

If I run “import theano” again, I get

>>> import theano
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dsaxx005/anaconda2/lib/python2.7/site-packages/theano/__init__.py", line 100, in <module>
    if hasattr(theano.tests, "TheanoNoseTester"):
AttributeError: 'module' object has no attribute 'tests'

Here is my $HOME/.theanorc:

[global]
device = gpu
floatX = float32
[cuda]
root = /usr/local/cuda

I get the same error when installing everything from scratch with the latest packages and the appropriate changes (python3.6, cudnn7). I’ve been googling for solutions for the last few days and I’m really stuck…

Any thoughts?
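
Since the exception points at the NVIDIA driver, it may help to confirm which driver version the kernel has actually loaded. The following is only a minimal sketch and not part of install-gpu.sh. It assumes the proprietary Linux driver is exposing /proc/driver/nvidia/version, and the helper names are made up for illustration.

```python
import re

# Assumption: the proprietary NVIDIA Linux driver exposes this file while
# its kernel module is loaded; it is absent otherwise.
NVIDIA_VERSION_FILE = "/proc/driver/nvidia/version"


def parse_nvidia_driver_version(text):
    """Return the driver version found in `text`, or None.

    Hypothetical helper: it looks for the dotted number that follows
    the words "Kernel Module" in the version file.
    """
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def read_nvidia_driver_version(path=NVIDIA_VERSION_FILE):
    """Read the version file; return None if it cannot be read."""
    try:
        with open(path) as handle:
            return parse_nvidia_driver_version(handle.read())
    except IOError:  # on Python 3 this is an alias of OSError
        return None
```

Calling read_nvidia_driver_version() and getting None would suggest the NVIDIA kernel module is not loaded at all, which is a different problem from having an outdated driver.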


(Robbin) #2

I could not for the life of me get theano working. I changed the backend to tensorflow and all is well…I can finally start learning deep learning! :smiley:


(Dmitry) #3

I’m having the same problem :frowning:

This should be the issue:


#4

Hi @rubdub - could you explain how you changed the backend to tensorflow?

Thanks!

Edit: configuring ~/.keras/keras.json ain’t doin’ it :frowning:
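
For reference, here is a rough sketch of what switching the backend through the config file usually involves. It is not confirmed anywhere in this thread as the fix. The function name set_keras_backend is hypothetical; the "backend" key in ~/.keras/keras.json and the KERAS_BACKEND environment variable are the two settings Keras reads. If KERAS_BACKEND is set in the environment it takes precedence over the file, which is one possible reason an edit to keras.json appears to have no effect.

```python
import json
import os


def set_keras_backend(config_path, backend="tensorflow"):
    """Set the "backend" key in a keras.json file, keeping other keys.

    Hypothetical helper: creates the file (and its directory) if missing.
    """
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as handle:
            config = json.load(handle)

    config["backend"] = backend

    config_dir = os.path.dirname(config_path)
    if config_dir and not os.path.isdir(config_dir):
        os.makedirs(config_dir)
    with open(config_path, "w") as handle:
        json.dump(config, handle, indent=4)
    return config


# Typical call (not executed here, since it rewrites a file in your home):
# set_keras_backend(os.path.expanduser("~/.keras/keras.json"))
```

The Python process or notebook kernel has to be restarted afterwards, because Keras picks its backend once, at import time.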


#5

Has anybody managed to find a solution? I am experiencing the same issue and have tried searching around for several hours, with no luck so far…

Many thanks in advance.

Update:
I somehow managed to get it working. Following the link in the error message (http://developer.nvidia.com/cuda-downloads), I installed the newest version of cuda, and I also installed a more recent version of theano than the one installed with the install-gpu.sh script.

I then changed .theanorc config to use the cuda backend instead of gpu:

[global]
#device = gpu
device = cuda
floatX = float32

[cuda]
root = /usr/local/cuda

I also added the following to my .bashrc file to define the MKL_THREADING_LAYER environment variable:

export MKL_THREADING_LAYER=GNU

After restarting the user session, things started to work, and I was able to import theano without errors.
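
A quick way to double-check both settings before importing theano is sketched below. This is an illustration only: the helper names are invented, and it only inspects the .theanorc file and the environment. Note that Theano also reads the THEANO_FLAGS environment variable, which takes precedence over .theanorc, so the file may not tell the whole story.

```python
import os

try:
    from configparser import ConfigParser  # Python 3
except ImportError:
    from ConfigParser import ConfigParser  # Python 2


def theanorc_device(path):
    """Return the [global] device value from a .theanorc-style file, or None."""
    parser = ConfigParser()
    parser.read(path)  # a missing file is skipped without raising
    if parser.has_option("global", "device"):
        return parser.get("global", "device")
    return None


def mkl_threading_layer_is_gnu(environ=None):
    """True if MKL_THREADING_LAYER is set to GNU in the given environment."""
    environ = os.environ if environ is None else environ
    return environ.get("MKL_THREADING_LAYER") == "GNU"


# Typical use, from the same shell or notebook that will import theano:
# print(theanorc_device(os.path.expanduser("~/.theanorc")))
# print(mkl_threading_layer_is_gnu())
```

If the second check prints False in the process where theano is imported, the export in .bashrc has not reached that process yet.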

BTW, just as a sanity check: make sure that your GUI session is actually running on the NVIDIA display driver (I lost some time on that, too).


I hope these hints prove helpful to somebody. Sorry if the post is not that well structured, but my memory is a bit fuzzy (I was fixing the issue long into the night without taking notes), and some important details might have been omitted.


#6

Hi @plamut
Thanks for the guidelines. I’ve followed them exactly, with one difference: my machine is set up on Azure instead of AWS. I’ve changed the .theanorc to:

[global]
device = cuda*
floatX = float32

[cuda]
root = /usr/local/cuda

I’ve also updated my ~/.bashrc file and added

export MKL_THREADING_LAYER=GNU

but unfortunately I’m still getting the error below when running:

import utils; reload(utils)
from utils import plots

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-1-834d59d32016> in <module>()
----> 1 import utils; reload(utils)
      2 from utils import plots

/home/ubuntu/nbs/utils.py in <module>()
     26 from IPython.lib.display import FileLink
     27 
---> 28 import theano
     29 from theano import shared, tensor as T
     30 from theano.tensor.nnet import conv2d, nnet

/home/ubuntu/anaconda2/lib/python2.7/site-packages/theano/__init__.pyc in <module>()
    122 from theano.printing import pprint, pp
    123 
--> 124 from theano.scan_module import (scan, map, reduce, foldl, foldr, clone,
    125                                 scan_checkpoints)
    126 

/home/ubuntu/anaconda2/lib/python2.7/site-packages/theano/scan_module/__init__.py in <module>()
     39 __contact__ = "Razvan Pascanu <r.pascanu@gmail>"
     40 
---> 41 from theano.scan_module import scan_opt
     42 from theano.scan_module.scan import scan
     43 from theano.scan_module.scan_checkpoints import scan_checkpoints

/home/ubuntu/anaconda2/lib/python2.7/site-packages/theano/scan_module/scan_opt.py in <module>()
     58 
     59 import theano
---> 60 from theano import tensor, scalar
     61 from theano.tensor import opt, get_scalar_constant_value, Alloc, AllocEmpty
     62 from theano import gof

/home/ubuntu/anaconda2/lib/python2.7/site-packages/theano/tensor/__init__.py in <module>()
     15 from theano.tensor import opt
     16 from theano.tensor import opt_uncanonicalize
---> 17 from theano.tensor import blas
     18 from theano.tensor import blas_scipy
     19 from theano.tensor import blas_c

/home/ubuntu/anaconda2/lib/python2.7/site-packages/theano/tensor/blas.py in <module>()
    153 from theano.scalar import bool as bool_t
    154 from theano.tensor import basic as T
--> 155 from theano.tensor.blas_headers import blas_header_text
    156 from theano.tensor.blas_headers import blas_header_version
    157 from theano.tensor.opt import in2out, local_dimshuffle_lift

/home/ubuntu/anaconda2/lib/python2.7/site-packages/theano/tensor/blas_headers.py in <module>()
    985 
    986 
--> 987 if not config.blas.ldflags:
    988     _logger.warning('Using NumPy C-API based implementation for BLAS functions.')
    989 

/home/ubuntu/anaconda2/lib/python2.7/site-packages/theano/configparser.pyc in __get__(self, cls, type_, delete_key)
    330             except KeyError:
    331                 if callable(self.default):
--> 332                     val_str = self.default()
    333                 else:
    334                     val_str = self.default

/home/ubuntu/anaconda2/lib/python2.7/site-packages/theano/configdefaults.pyc in default_blas_ldflags()
   1406             res = try_blas_flag(flags)
   1407             if res:
-> 1408                 check_mkl_openmp()
   1409                 maybe_add_to_os_environ_pathlist('PATH', lib_path[0])
   1410                 return res

/home/ubuntu/anaconda2/lib/python2.7/site-packages/theano/configdefaults.pyc in check_mkl_openmp()
   1250         import mkl
   1251         if '2018' in mkl.get_version_string():
-> 1252             raise RuntimeError('To use MKL 2018 with Theano you MUST set "MKL_THREADING_LAYER=GNU" in your environement.')
   1253     except ImportError:
   1254         raise RuntimeError("""

RuntimeError: To use MKL 2018 with Theano you MUST set "MKL_THREADING_LAYER=GNU" in your environement.

Any thoughts or suggestions would be appreciated.

Thanks in advance,
Osama
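
One possible explanation, which is an assumption about this setup and not something confirmed here, is that the notebook server was started before ~/.bashrc was edited, or from a shell that never sources it, so the kernel does not see the variable. A workaround sketch is to set it inside the Python process before theano is imported. The helper name below is made up for illustration.

```python
import os


def force_gnu_threading_layer(environ=None):
    """Set MKL_THREADING_LAYER=GNU and return the previous value (or None).

    Hypothetical helper: it must run before the first `import theano`,
    because the RuntimeError above is raised during that import.
    """
    environ = os.environ if environ is None else environ
    previous = environ.get("MKL_THREADING_LAYER")
    environ["MKL_THREADING_LAYER"] = "GNU"
    return previous


# In a notebook this would go in the very first cell:
# force_gnu_threading_layer()
# import theano
```

If theano has already been imported (or has already failed to import) in the running kernel, restarting the kernel first is the safer route.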


#7

This is so weird. After lots of searching, trial and error, I decided to reboot the VM. When it came back up, I changed the setting back to device = gpu, as it was originally set up, and ran the code again, and it ran successfully. I never thought the MS “turn it off and then back on” workaround would also apply to Linux just because it’s hosted on an MS platform :slight_smile: