I had been enjoying my little dual-boot Win10 + Ubuntu server with a GTX 1080 Ti for the last 2 weeks, until it became unstable this morning, so I ran a bunch of “sudo apt-get install/update/upgrade” commands.
I can’t recall at what stage it went really wrong, but suddenly I got flooded with pink-box messages when starting notebooks, such as:
INFO (theano.gof.compilelock): Waiting for existing lock by process '2478' (I am process '2680')
INFO (theano.gof.compilelock): To manually release the lock, delete /home/eric/.theano/compiledir_Linux-4.8--generic-x86_64-with-debian-stretch-sid-x86_64-2.7.13-64/lock_dir
1 #define _CUDA_NDARRAY_C
3 #include <Python.h>
4 #include <structmember.h>
5 #include "theano_mod_helper.h"
7 #include <numpy/arrayobject.h>
10 #include "cuda_ndarray.cuh"
12 #ifndef CNMEM_DLLEXPORT
13 #define CNMEM_DLLEXPORT
16 #include "cnmem.h"
17 #include "cnmem.cpp"
19 //If true, when there is a gpu malloc or free error, we print the size of allocated memory on the device.
20 #define COMPUTE_GPU_MEM_USED 0
22 //If true, we fill with NAN allocated device memory.
23 #define ALLOC_MEMSET 0
25 //If true, we print out when we free a device pointer, uninitialize a
26 //CudaNdarray, or allocate a device pointer
27 #define PRINT_FREE_MALLOC 0
29 //If true, we do error checking at the start of functions, to make sure there
30 //is not a pre-existing error when the function is called.
31 //You probably need to set the environment variable
32 //CUDA_LAUNCH_BLOCKING=1, and/or modify the CNDA_THREAD_SYNC
33 //preprocessor macro in cuda_ndarray.cuh
34 //if you want this to work.
35 #define PRECHECK_ERROR 0
37 cublasHandle_t handle = NULL;
38 int* err_var = NULL;
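The lock message above says the lock can be released by deleting the lock_dir folder. A minimal sketch of doing that from Python, assuming Theano's default base directory ~/.theano and that no other Theano process is still compiling (the helper names find_theano_locks and release_theano_locks are hypothetical, not part of Theano):

```python
import glob
import os
import shutil


def find_theano_locks(base_dir=None):
    """Return lock_dir paths found under Theano compile directories."""
    # Assumption: Theano's default base directory is ~/.theano.
    if base_dir is None:
        base_dir = os.path.join(os.path.expanduser("~"), ".theano")
    pattern = os.path.join(base_dir, "compiledir_*", "lock_dir")
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


def release_theano_locks(base_dir=None):
    """Delete every lock_dir found and return the paths that were removed."""
    removed = []
    for lock in find_theano_locks(base_dir):
        # Only safe if the process holding the lock is really gone.
        shutil.rmtree(lock)
        removed.append(lock)
    return removed
```

Calling find_theano_locks() first shows what would be deleted; release_theano_locks() then removes those folders. If the process named in the message (here ‘2478’) is still running, it is probably better to stop it first.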
I reinstalled Theano + Keras + CUDA multiple times, with no success.
Then I wiped out Anaconda2 entirely, using the “Anaconda-clean” package, followed by a brutal “rm -rf ~/anaconda2”.
And more tweaking here and there.
Now I can run Lesson1 cell #7 again, the “state of the art custom model in 7 lines of code with one epoch of Vgg16”.
It is slower than before (307 sec vs. 205 sec), but at least it runs.
But I keep having a nasty cuDNN message at launch:
Can not use cuDNN on context None: cannot compile with cuDNN. We got this error:
/tmp/try_flags_JuwE3B.c:4:19: fatal error: cudnn.h: No such file or directory
Mapped name None to device cuda: GeForce GTX 1080 Ti (0000:01:00.0)
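The “cudnn.h: No such file or directory” line means the compiler could not find the cuDNN header on its include path. A minimal sketch to check whether the header exists in a few commonly used locations; the directory list and the CUDA_HOME / CUDA_ROOT environment variables are assumptions about a typical install, and find_cudnn_header is a hypothetical helper name:

```python
import os


def find_cudnn_header(extra_dirs=()):
    """Return the paths where cudnn.h exists among candidate include dirs."""
    # Assumption: typical include locations; adjust for your own install.
    candidates = list(extra_dirs) + [
        "/usr/local/cuda/include",
        "/usr/include",
    ]
    # Assumption: CUDA_HOME or CUDA_ROOT may point at the CUDA install.
    cuda_home = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_ROOT")
    if cuda_home:
        candidates.append(os.path.join(cuda_home, "include"))

    found = []
    for directory in candidates:
        header = os.path.join(directory, "cudnn.h")
        if os.path.isfile(header) and header not in found:
            found.append(header)
    return found
```

If find_cudnn_header() returns an empty list, the cuDNN headers are probably not installed in any of those directories, which would be consistent with the message above. Pass extra directories as a list if cuDNN was unpacked somewhere else.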
Has anyone encountered that?