Installation of Nvidia Cuda Toolkit and CuDNN library

RockShox · June 22, 2020, 5:46am

Hi, everyone. I am trying to install the Nvidia CUDA Toolkit and the cuDNN library on my laptop (OS: Windows 10).
I am facing a version compatibility problem while installing the two.
Can anyone please suggest the correct versions of
Python, Anaconda, TensorFlow, CUDA Toolkit, and cuDNN
to run deep learning algorithms on my system?

moon · June 23, 2020, 9:25pm

I was able to install CUDA and cuDNN on Windows for TensorFlow with RStudio and Jupyter/Python for the first half of the first fast.ai course years ago (when the first classifier was still dogs vs cats and not dog breeds), but in the end it was not worth the time. I will work on re-installing CUDA/cuDNN on my Windows laptop to see if I can get a better answer for you in 2020 for Windows (this will be provided at a later date).

Much preferred over installing CUDA on Windows, for Python/TensorFlow and just getting started: your best option is one of the ready-to-run "one-click" Jupyter options, especially the free ones, listed at the link below:
https://course.fast.ai/
Next, consider that there are a lot of TensorFlow.js resources for getting into machine learning concepts right away on virtually any hardware:
The Coding Train
https://ml5js.org/
https://teachablemachine.withgoogle.com/ (old but good)
https://www.tensorflow.org/js/tutorials/transfer/what_is_transfer_learning (this is a fun demo to start with)

TensorFlow.js uses WebGL for GPU acceleration on Nvidia/AMD/Intel GPUs, and probably on any ARM processor that allocates enough memory to the GPU (currently my Raspberry Pi 4 4GB is not compatible with the basic MNIST demo). Better acceleration backends are expected to be added to TensorFlow.js later.

If you do not want to start with JavaScript, consider the following procedure for Linux on your own hardware instead:

1) Be in a situation where you are OK installing an Ubuntu flavor as the main OS on your laptop, or have two laptops, or be able to partition your laptop to run an Ubuntu flavor.
2) Download Pop!_OS 20.04 LTS from https://pop.system76.com/ with the proprietary NVIDIA driver preinstalled (this is an Ubuntu flavor).
3) Set up https://github.com/NVIDIA/nvidia-docker (this supports a number of Linux distributions, not just Ubuntu-based ones).
4) Set up a tensorflow-docker image. The latest docker image for TF 2.2 as of this posting still runs Ubuntu 18.04 inside; this is fine, as the image (generated by Google) contains CUDA and cuDNN in a configuration that is known to work.
https://www.tensorflow.org/install/docker

docker pull tensorflow/tensorflow:latest-gpu-jupyter

5) Understand how to make sure docker publishes a port on your host machine so you can connect to Jupyter, and how to use docker commands to open bash directly in your container, among other docker commands; this is probably the most difficult step.
https://hub.docker.com/r/tensorflow/tensorflow

5.1) You may also need to specify a GPU count (--gpus) as well as --ipc=host to use a GPU:

docker run --gpus 1 -it --ipc="host" --rm someidentifier/someimagename
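For step 5, the part that usually trips people up is publishing the Jupyter port. Here is a rough sketch in Python that assembles the full command so each flag is visible. The helper name build_docker_run is made up for illustration, and port 8888 is an assumption (it is the Jupyter default, and the port used in the TensorFlow docker docs); adjust if your image listens elsewhere.

```python
import shlex


def build_docker_run(image, gpus="1", host_port=8888, container_port=8888):
    """Assemble a `docker run` argument list for a GPU Jupyter container.

    Hypothetical helper for illustration only; it just builds the command,
    it does not run it.
    """
    cmd = ["docker", "run", "--gpus", str(gpus), "-it", "--rm", "--ipc=host"]
    # -p HOST:CONTAINER publishes the container's Jupyter port on the host.
    cmd += ["-p", f"{host_port}:{container_port}"]
    cmd.append(image)
    return cmd


cmd = build_docker_run("tensorflow/tensorflow:latest-gpu-jupyter")
# shlex.join (Python 3.8+) quotes the arguments so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Once the container is up, Jupyter should be reachable from the host browser on the published host port. To get a shell in a running container instead, `docker exec -it <container> bash` is the usual route.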

Remember to set TF_FORCE_GPU_ALLOW_GROWTH to true as an environment variable inside the docker container (it sounds confusing, but last I checked this is needed on Windows as well).
https://www.tensorflow.org/guide/gpu
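If you would rather not touch the container's environment, one possible approach is to set the variable from Python at the top of the notebook, before TensorFlow is imported. This is only a sketch: the list_gpus helper is a name I made up, and it returns None when TensorFlow is not installed so the snippet still runs anywhere.

```python
import os

# Set this before TensorFlow is imported, so it is in place when the GPU is initialized.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


def list_gpus():
    """Return the names of GPUs visible to TensorFlow, or None if TF is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = list_gpus()
print("TF_FORCE_GPU_ALLOW_GROWTH =", os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
print("GPUs visible to TensorFlow:", gpus)
```

An empty list means TensorFlow imported fine but does not see a GPU, which usually points at the driver or the docker GPU flags. With docker, the same variable can also be passed at startup with `-e TF_FORCE_GPU_ALLOW_GROWTH=true`.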

===
Maintaining TensorFlow and CUDA/cuDNN on any installation is a problem, and plenty of people are working on innovative ideas to address it. For example, tensorman is an attempt to make docker images easier to use.

As much as I like System76 and their tensorman effort, their Linux distribution is probably more valuable, in that they provide a version that installs the NVIDIA drivers right out of the box. I'm sure someone smarter than me can find a way to integrate tensorman into regular tutorial workflows, but I've not found it very convenient.

There are other options that do not require a powerful GPU or even tensorflow.

PyTorch is MKL-enabled, as is GluonCV by Apache MXNet:
https://gluon-cv.mxnet.io/install.html

pip install --upgrade mxnet-mkl gluoncv

https://gluon-cv.mxnet.io/tutorials/index.html

MKL-enabled GluonCV is markedly faster on my CPU (AMD R5 3600) than the version without MKL.
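To check whether the build you actually installed has MKL, something like the sketch below may help. The two function names are made up for illustration. The feature name "MKLDNN" is an assumption based on what MXNet 1.x reports in its runtime feature list; other releases may use a different name, in which case the function returns None instead of failing.

```python
def torch_has_mkl():
    """True/False if PyTorch is installed, None if it is not."""
    try:
        import torch
    except ImportError:
        return None
    return bool(torch.backends.mkl.is_available())


def mxnet_has_mkldnn():
    """True/False if MXNet reports the MKLDNN feature, None if it cannot be checked."""
    try:
        from mxnet.runtime import Features
        # Assumption: the feature is named "MKLDNN", as in MXNet 1.x builds.
        return bool(Features().is_enabled("MKLDNN"))
    except Exception:
        # MXNet is missing, failed to import, or does not know this feature name.
        return None


torch_mkl = torch_has_mkl()
mxnet_mkl = mxnet_has_mkldnn()
print("PyTorch MKL:", torch_mkl)
print("MXNet MKLDNN:", mxnet_mkl)
```

None means the library could not be checked on this machine, not that MKL is absent.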

moon · June 24, 2020, 7:52pm

For Windows: these are the compatible versions of cudatoolkit, cudnn, and tensorflow as installed through conda on 2020-06-24 (many other packages are omitted (***)):

Name                     Version     Build

cudatoolkit              10.1.243    h74a9793_0
cudnn                    7.6.5       cuda10.1_0

tensorboard              2.2.1       pyh532a8cf_0
tensorboard-plugin-wit   1.6.0       py_0
tensorflow               2.1.0       gpu_py37h7db9008_0
tensorflow-base          2.1.0       gpu_py37h55f5790_0
tensorflow-estimator     2.1.0       pyhd54b08b_0
tensorflow-gpu           2.1.0       h0d30ee6_0

Assuming basic familiarity with PowerShell and Anaconda, the following commands were used to install these packages:

conda update -n base -c defaults conda
conda create -n envirName
conda activate envirName
conda install tensorflow-gpu
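After the install finishes, `conda list` prints the package table shown above. If you want to pull out just the versions that matter, a small parser along these lines could work. This is a sketch: parse_conda_list is a hypothetical helper, and the sample text is copied from the listing above so that it runs without conda installed.

```python
# A few rows copied from the package listing above.
SAMPLE = """\
# Name Version Build
cudatoolkit 10.1.243 h74a9793_0
cudnn 7.6.5 cuda10.1_0
tensorflow 2.1.0 gpu_py37h7db9008_0
tensorflow-gpu 2.1.0 h0d30ee6_0
"""


def parse_conda_list(text):
    """Map package name -> (version, build) from `conda list` style text."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' header lines that conda prints.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 3:
            packages[parts[0]] = (parts[1], parts[2])
    return packages


pkgs = parse_conda_list(SAMPLE)
for name in ("cudatoolkit", "cudnn", "tensorflow"):
    print(name, pkgs[name][0])
```

To use it on a real environment, the output of `conda list` could be captured with subprocess.run and passed in as the text.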

To verify functionality, grab this notebook
https://www.tensorflow.org/tutorials/quickstart/beginner
https://storage.googleapis.com/tensorflow_docs/docs/site/en/tutorials/quickstart/beginner.ipynb

Copy it to the current directory and run it in Jupyter (installed through Anaconda Navigator) to see that CUDA/cuDNN are working. Append the following code to the end of the notebook to see which versions of CUDA/cuDNN TensorFlow reports (not necessarily with the same number of significant digits as the conda package versions):

from tensorflow.python.platform import build_info as tf_build_info
print(tf_build_info.cuda_version_number)

10.1

print(tf_build_info.cudnn_version_number)

7.6
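Because conda reports 10.1.243 and 7.6.5 while TensorFlow reports 10.1 and 7.6, a plain string comparison would say they differ. One way to compare them is to look only at the components both strings have. The helper below is a sketch with a made-up name, not anything provided by TensorFlow or conda.

```python
def versions_match(installed, reported):
    """Compare two dotted version strings on the components they share.

    "10.1.243" vs "10.1" compares only ("10", "1"), so the two match.
    """
    a = installed.split(".")
    b = reported.split(".")
    n = min(len(a), len(b))
    return a[:n] == b[:n]


# Values taken from the conda listing and the TensorFlow output in this post.
print(versions_match("10.1.243", "10.1"))  # cudatoolkit vs cuda_version_number
print(versions_match("7.6.5", "7.6"))      # cudnn vs cudnn_version_number
```

Both calls print True for the versions in this post; a mismatch such as 10.0.130 against 10.1 would print False.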

Additionally, the Stack Overflow link below has links to the TF documentation on the tested build combinations:

Anaconda Individual Python 3.7
64-Bit Graphical Installer (466 MB)

Windows specifications
Edition Windows 10 Home
Version 2004

CPU AMD Ryzen
SYSTEM RAM 16.0 GB 3200 MHz DDR4
GPU NVIDIA 1660 Ti
VRAM 6GB GDDR6