Jeremy's Harebrained install guide

You may consider posting to the Google Group: https://groups.google.com/a/tensorflow.org/forum/#!forum/swift

They’re pretty responsive.

Working inside a Jupyter notebook works fine for me, but if I try to run swift at the command line to open the REPL, I still get this issue, even with the latest version:

Does anyone have any insight on this? I just want to make sure I don’t have something misconfigured.

I’m not sure S4TF works with CUDA 10.1 yet. I could be wrong, but last I checked it required CUDA 10.0 and cuDNN 7.5. Make sure you run the ldconfig part of Jeremy’s install guide. I’d also run sudo vim /etc/ld.so.conf to make sure there’s not an older version (or 10.1) referenced there. Here’s what mine looks like:

include /etc/ld.so.conf.d/*.conf
/usr/local/cuda-10.0/lib64
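
If you’d rather not open the file in an editor, a rough sketch like the following lists the CUDA-looking entries the dynamic linker is configured with. The helper name list_cuda_ld_paths is made up for illustration, and it assumes the standard /etc/ld.so.conf layout shown above.

```shell
# Hypothetical helper: print every CUDA-looking library directory listed
# in the given ld.so.conf-style files (defaults to the system ones).
list_cuda_ld_paths() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/ld.so.conf /etc/ld.so.conf.d/*.conf
    fi
    # -h hides file names, -i ignores case, -s silences missing-file errors.
    # The second grep drops commented-out lines.
    grep -his 'cuda' "$@" | grep -v '^[[:space:]]*#'
}

# Usage:
#   list_cuda_ld_paths     # more than one CUDA version listed is a warning sign
#   sudo ldconfig          # rebuild the linker cache after editing the files
```

If more than one CUDA version shows up, that may be the stale reference worth removing before re-running ldconfig.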

And here’s what my path looks like:

(base) j@j:~$ echo $PATH
/usr/local/cuda/bin:/home/j/dev/swift/stf-builds/development/usr/bin:/home/j/anaconda3/bin:/home/j/anaconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/db/bin:/usr/lib/jvm/java-8-oracle/jre/bin
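
A long PATH like that is easier to compare when it is printed one entry per line. This is just a small sketch; path_entries is a hypothetical helper name, not something from the install guide.

```shell
# Hypothetical helper: print each entry of a PATH-style string on its own
# line, in search order. Defaults to the current $PATH.
path_entries() {
    printf '%s\n' "${1:-$PATH}" | tr ':' '\n'
}

# Usage:
#   path_entries             # inspect the search order
#   command -v swift         # show which swift binary is found first
```

The entry order matters: the first directory that contains a swift binary is the one the shell will use.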

Did you ever find a solution to this?

I haven’t been brave enough to test and also I still don’t understand what I should do.

FWIW, as far as I can tell, I have 10.1, but it’s installed in /usr/local/cuda-10.0. That is, there is no /usr/local/cuda-10.1, whereas there is, e.g., /usr/local/cuda-9.2.

And S4TF seems to work just fine.

This is what nvidia-smi tells me:

nvidia-smi
Thu May  2 22:14:58 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.40.04    Driver Version: 418.40.04    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
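
A possible explanation, offered with some caution: as far as I know, the “CUDA Version” in the nvidia-smi header is the highest CUDA version the installed driver supports, not the toolkit version installed under /usr/local. So seeing 10.1 there alongside a /usr/local/cuda-10.0 directory would be consistent with having the 10.0 toolkit. A small sketch for comparing the two (smi_cuda_version is a made-up helper name):

```shell
# Hypothetical helper: pull the "CUDA Version" field out of nvidia-smi's
# header, read from stdin. Prints nothing if the field is absent.
smi_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9.]*\).*/\1/p' | head -n 1
}

# Usage (compare what the driver reports with what is on disk):
#   nvidia-smi | smi_cuda_version    # version reported by the driver
#   ls -d /usr/local/cuda*           # toolkit directories that exist
#   readlink -f /usr/local/cuda      # where the default symlink points
#   nvcc --version                   # version of the nvcc found on PATH
```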

Let me know if you find a solution. Hopefully someone else can help us. I have read some questions about similar problems, but I can’t find certain files they say to change, maybe because my problem is slightly different. For example: https://askubuntu.com/questions/1025928/why-do-i-get-sbin-ldconfig-real-usr-local-cuda-lib64-libcudnn-so-7-is-not-a
When I run ls -lha libcudnn*, it just says there is no file or directory with that name.
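
One thing to keep in mind is that ls -lha libcudnn* only looks in the current directory, so it fails unless you are already inside the directory that holds the cuDNN libraries. A rough sketch for searching instead (find_cudnn is a hypothetical helper, and the default search roots are only a guess at common locations):

```shell
# Hypothetical helper: search the given directories for cuDNN shared
# libraries. Defaults to /usr/local and /usr/lib, which is an assumption
# about where CUDA and system libraries usually live.
find_cudnn() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/local /usr/lib
    fi
    # -H follows symlinks given on the command line (e.g. /usr/local/cuda).
    # 2>/dev/null hides errors for missing or unreadable directories.
    find -H "$@" -name 'libcudnn*' 2>/dev/null
}

# Usage:
#   find_cudnn                          # search the default locations
#   find_cudnn /usr/local/cuda-10.0     # search one toolkit directory
#   ldconfig -p | grep -i libcudnn      # check what the linker cache knows about
```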

Try upgrading your NVIDIA driver version. I did that and now mine is working. I went from 396 up to 418.56, and that seems to have fixed my issues. Shout-out to @hiromi for finding this solution. Hopefully it helps you as well!
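
To see which driver you currently have and compare it against the version that worked here, something like this sketch might help. version_at_least is a made-up helper, and 418.56 is simply the version mentioned in this thread, not a documented minimum.

```shell
# Hypothetical helper: succeed if version $1 is >= version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage (needs a working nvidia-smi):
#   current=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   version_at_least "$current" 418.56 && echo "driver $current is at least 418.56"
```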

Where is that answer? Do I only need to update the NVIDIA driver?

I would try that. It worked for me. You’re seeing it here first as far as I know. Basically Hiromi and I just looked at what differences there were between our two instances and that was the biggest difference. After upgrading that, it worked for me. Let me know if it works for you.


I used the .run file directly. The steps for me:

  1. Download the new driver’s .run file with wget. Here is the one I used:
    wget http://us.download.nvidia.com/XFree86/Linux-x86_64/418.56/NVIDIA-Linux-x86_64-418.56.run
    If you have anything besides a 1080 Ti, go here to find the correct driver for you: https://www.nvidia.com/Download/Find.aspx
  2. Press Ctrl+Alt+F1 to switch to a command line.
  3. Stop the X server that is running the graphics: service lightdm stop
  4. Make the .run file executable: chmod +x NVIDIA-Linux-x86_64-418.56.run
  5. Run the .run file with sudo: sudo ./NVIDIA-Linux-x86_64-418.56.run
  6. Follow the installer’s prompts (I guessed on a lot of the yes/no questions).
  7. Verify it worked properly: nvidia-smi
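
The steps above could be collected into something like the sketch below. It only prints the commands so you can review them before running anything. The function name is hypothetical, the URL pattern is an assumption generalized from the single 418.56 link above, and sudo is added to the service command on the assumption that you are not logged in as root.

```shell
# Sketch only: print (do not run) the install commands for a given driver
# version and display manager service. The body runs in a subshell, so the
# variables below do not leak into your shell.
print_driver_install_steps() (
    version="${1:-418.56}"     # driver version used in this thread
    dm="${2:-lightdm}"         # display manager service to stop
    run_file="NVIDIA-Linux-x86_64-${version}.run"
    cat <<EOF
wget http://us.download.nvidia.com/XFree86/Linux-x86_64/${version}/${run_file}
sudo service ${dm} stop
chmod +x ${run_file}
sudo ./${run_file}
nvidia-smi
EOF
)

# Usage:
#   print_driver_install_steps               # 418.56 with lightdm
#   print_driver_install_steps 418.56 gdm3   # same driver, gdm3 instead
```

Switching to a text console (Ctrl+Alt+F1) is still a manual step before stopping the display manager.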

I think this is the correct order to get it working. Let me know if you run into any issues with this.

:face_with_raised_eyebrow:

Maybe I should just use some other computer for this. It seems like this is too hard for me. Do I need a GPU, or can I use the CPU on my laptop?

Can you try running these commands:

systemctl status display-manager

cat /etc/X11/default-display-manager

I thought the same thing yesterday before Hiromi helped. We can get through this :slight_smile:

/usr/sbin/gdm3
Let’s try.

Ok, I think this will work for you instead of lightdm: service gdm3 stop

Then continue with the other steps.
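
If you are not sure which display manager you have, a sketch like this derives the service name from the file checked above. display_manager_service is a made-up helper, and it assumes the service is named after the binary (true for lightdm and gdm3, but not guaranteed in general).

```shell
# Hypothetical helper: print the display manager's name based on the path
# stored in /etc/X11/default-display-manager (or a file passed as $1).
# The body runs in a subshell so dm_file does not leak into your shell.
display_manager_service() (
    dm_file="${1:-/etc/X11/default-display-manager}"
    [ -r "$dm_file" ] || exit 1
    # The file holds a full path such as /usr/sbin/gdm3; keep the last part.
    basename "$(head -n 1 "$dm_file")"
)

# Usage:
#   display_manager_service                           # e.g. gdm3
#   sudo service "$(display_manager_service)" stop
```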

Looks good. Let me try the other steps first.

I already tried rebooting the computer, but it’s still not working.

Hmm, a reboot took that message away for me. Do you have anything starting automatically when you start your computer?

Did you run the stop command again after the reboot?