Unofficial Setup thread (Local, AWS)

gauravjbjain · October 12, 2018, 4:18am

thanks Jeremy, I’ll be careful…

pnvijay · October 12, 2018, 4:32am

V2 of the course had an AMI on AWS for fast.ai, which we used for the course. Can we have the same for V3? Or do we have to follow the steps prescribed at the start of this thread and do it ourselves?

init_27 · October 12, 2018, 4:34am

Please use the steps to set it up; it’s just 3 commands to run.

I’ll be happy to create a Public AMI but I’m not sure if many people are going to use AWS.

pnvijay · October 12, 2018, 4:37am

Thanks @init_27. Will set up the AWS instance following the steps. Although AWS might be costlier than the other options, I somehow prefer it to the rest. Salamander is interesting though.

jeremy · October 12, 2018, 5:13am

We’ll be providing plenty of simple options for GPU servers once the course starts.

marcmuc · October 12, 2018, 6:31am

396 is not always 396, though: it depends on how and where you install it from, and on your card. I had the same problems as @liberus until I found the post below.
TL;DR: 396.26 works on Tesla cards but not on GeForce GTX; 396.24 is the one to use with GeForce cards. That helped me make it work, even though 396.26 is the version that came bundled with the CUDA installers. That was a while ago, though; I have upgraded since and have no problems with 396.54, which I’m currently running.
Source:

data-drone · October 12, 2018, 9:27am

I followed these instructions and for the last two statements:

python -c 'import fastai; print(fastai.__version__)'

and

python -c 'import fastai; fastai.show_install(0)'

I got:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/lib/python3.5/site-packages/fastai/__init__.py", line 1, in <module>
    from .basic_train import *
  File "/opt/conda/lib/python3.5/site-packages/fastai/basic_train.py", line 92
    data:DataBunch
        ^
SyntaxError: invalid syntax

I installed into a Docker container and have put the Dockerfile up here:
https://github.com/Data-drone/dl_toolkit/blob/master/docker_fastai/Dockerfile

howkhang · October 12, 2018, 10:29am

If you’re using docker, you could pull the latest pytorch 1.0 docker image from https://hub.docker.com/r/floydhub/pytorch/tags/ as it comes together with fastai 1.0.

Benoit_c · October 12, 2018, 11:38am

I’ve made the change in the Wiki, but if someone wants to do a clean install with CUDA 9.2 and the 396 driver, have a look at https://www.pugetsystems.com/labs/hpc/How-to-install-CUDA-9-2-on-Ubuntu-18-04-1184/ (edit: already posted above!)
I understand that cuda runtime is installed by conda and that’s all we need.

devforfu · October 12, 2018, 3:35pm

The wrong version of the Python interpreter, I guess.

As far as I can see from the source code, the required version is 3.7 (the most recent one). Or you can probably install 3.6 as well, but with dataclasses backported.
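
A minimal sketch of that diagnosis, assuming the failing line `data:DataBunch` is a variable annotation, which Python only parses from 3.6 onward (PEP 526), while the path in the traceback points at a python3.5 install. The helper name `check_interpreter` is hypothetical and not part of fastai. It avoids f-strings so that it still runs on the older interpreter being diagnosed.

```python
import sys


def check_interpreter(version_info=sys.version_info):
    """Report which syntax/library features this interpreter offers.

    Hypothetical helper for illustration only; not part of fastai.
    """
    version = tuple(version_info[:2])
    return {
        "version": "%d.%d" % version,
        # Annotated names such as `data:DataBunch` need Python 3.6+ (PEP 526).
        "variable_annotations": version >= (3, 6),
        # dataclasses ships with the standard library from 3.7; on 3.6 it is
        # only available as a separately installed backport.
        "dataclasses_in_stdlib": version >= (3, 7),
    }


if __name__ == "__main__":
    print(check_interpreter())
```

Run inside the container, `variable_annotations` coming back as `False` would be consistent with the SyntaxError above.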

wgpubs · October 12, 2018, 6:21pm

Also, as an FYI, I think this may be one of the most comprehensive articles for folks wanting to set up their own deep learning rig for both remote and local access.

We’re using it for both personal and professional development against both the previous version of fastai and the latest pytorch/fastai builds.

_venkat · October 12, 2018, 8:44pm

A little bummed out that Windows is not fully supported. I was hoping to use the weekend to try and set things up on my laptop using the Windows installation thread, just to see how well my laptop handles simple programs, if nothing else.

Edward · October 12, 2018, 10:34pm

Although it’s a pity that it doesn’t work with Windows, the installation as described in this post is straightforward and worked well (thanks for the easy-to-follow and succinct instructions!). Also, don’t fret about the Ubuntu install - it’s also straightforward and works a treat. It’s probably just as well to get used to dabbling with a Linux environment if you have little or no earlier experience of it, as I suspect one ends up on the command line now and again in this business.

Edward · October 12, 2018, 11:05pm

The local install instructions worked superbly and now I have a local system for doing long-haul processing on the cheap (seeing as I had a GTX 1060 card knocking about). The only tiny hiccup was the command

sudo modprobe nvidia

which returned nothing and caused me a little concern. So I tried the command

nvidia-smi -L
GPU 0: GeForce GTX 1060 6GB (UUID: GPU-0305697e-a6fc-e62c-3fd3-bba1a43fcb9e)

Which looked promising!
In the end everything installed and fired up correctly all the same, but perhaps a note should be added not to worry if there is no output from the modprobe command?
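
For anyone who wants to turn that check into something scriptable, here is a rough sketch that parses the listing printed by `nvidia-smi -L`. The line format is assumed from the single sample line above, and `parse_gpu_list` is a hypothetical helper name, not an NVIDIA or fastai API.

```python
import re

# Matches lines shaped like the sample above:
#   GPU 0: GeForce GTX 1060 6GB (UUID: GPU-0305697e-...)
# The format is an assumption based on that one sample line.
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: (GPU-[0-9a-fA-F-]+)\)$")


def parse_gpu_list(text):
    """Return a list of (index, name, uuid) tuples from `nvidia-smi -L` output."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append((int(match.group(1)), match.group(2), match.group(3)))
    return gpus


sample = "GPU 0: GeForce GTX 1060 6GB (UUID: GPU-0305697e-a6fc-e62c-3fd3-bba1a43fcb9e)"
print(parse_gpu_list(sample))
```

An empty list would suggest the driver is not seeing any card, whatever modprobe did or did not print.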

init_27 · October 12, 2018, 11:18pm

@Edward You can read about modprobe.
In general, no errors = no issues

init_27 · October 12, 2018, 11:19pm

It will definitely be helpful and is highly recommended.

Please feel free to add links to any Linux and command line tutorials that you think would be worth a mention.

helena · October 13, 2018, 2:14am

I have a local rig. Started with a GTX 1080 about a year and a half ago and recently upgraded to a GTX 1080 Ti. Decided to stick with Ubuntu 16.04 - the rest (NVIDIA drivers, CUDA etc.) was the latest.
My observations: troubleshooting by googling is becoming harder and harder. You can google your Python question and find a decent answer, but that won’t work for your CUDA/Ubuntu whatever. I mean, I coded my compilers/protocols in the cubicles of Bell Labs and still have trouble parsing the answers on askubuntu and such.

Edward · October 13, 2018, 5:53am

My bad, I misunderstood “nvidia-smi” as a return value.

ajans1 · October 13, 2018, 10:01am

I was following the instructions in the OP on a fresh Ubuntu 18.04 LTS install and all was good until:

python -c 'import torch; print(torch.cuda.device_count()); '

Which returned 0, I also tried

python -c 'import torch; print(torch.cuda.is_available()); '

Which returned “False”. I tried a bunch of things to no avail, until I saw the post by “saltybald” in this thread:

I took his advice and installed the 396 nvidia drivers with sudo apt install nvidia-driver-396 (having previously installed the 390 drivers, as per the instructions), did a restart and suddenly things started working.
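
The two one-liners above can be folded into one small report, sketched below. `cuda_report` is a hypothetical helper name; it only uses `torch.cuda.is_available()`, `torch.cuda.device_count()` and `torch.version.cuda`, and it guards the import so that it also runs where PyTorch is missing.

```python
def cuda_report():
    """Collect what PyTorch can see of the GPU setup, without raising.

    Hypothetical helper for illustration; not part of PyTorch or fastai.
    """
    report = {
        "torch_installed": False,
        "cuda_available": False,
        "device_count": 0,
        "built_with_cuda": None,
    }
    try:
        import torch
    except ImportError:
        return report
    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    report["device_count"] = int(torch.cuda.device_count())
    # CUDA version this PyTorch build was compiled against (None on CPU-only builds).
    report["built_with_cuda"] = torch.version.cuda
    return report


if __name__ == "__main__":
    print(cuda_report())
```

If `built_with_cuda` is set but `cuda_available` is still `False`, a driver mismatch like the 390 vs 396 case described here is one plausible cause worth checking.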

_venkat · October 13, 2018, 12:31pm

The only thing I’m worried about is installing Linux on my laptop. I’d been a Linux user for about 10 years (with a dual-booted Windows XP as a backup, but it was rarely used). That was before the UEFI/secure boot thing came about; my setup was simple enough that I didn’t have to worry about breaking anything or having driver issues. Now, however, I have a laptop for which I shelled out a lot. I’m not even sure if I can send this model (XPS 9560) for repairs where I live if something were to break.

When I bought it I had it all planned out: wait for the first LTS point release to show up, read reviews and install sometime in mid-July if all was good. By the time June rolled around, I’d become too scared (and possibly a bit lazy) to tamper with the laptop. I have Ubuntu running in a VM, but that won’t be any good.