Unofficial Setup thread (Local, AWS)

Pavan · October 20, 2018, 4:32pm

I tried following these steps in Colab.
I am getting ModuleNotFoundError: No module named ‘fastai.data’.

I see Successfully installed dataclasses-0.6 fastai-1.0.7 fastprogress-0.1.10 ipywidgets-7.4.2 jupyter-1.0.0 jupyter-console-6.0.0 numpy-1.15.2 qtconsole-4.4.2 torchvision-nightly-0.2.1 widgetsnbextension-3.4.2…

I tried doing !pip install fastai.data.
Still can’t install it.
I am able to import fastai.vision successfully. Any suggestions?

I am able to run the provided vision example properly. The issue comes only with fastai.data.
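
One way to see which submodules an installed package actually ships is to list them with the standard library. The sketch below is a generic helper (the name `list_submodules` is made up for illustration); it is demonstrated on the stdlib `json` package, and on a machine with fastai installed you could pass `"fastai"` instead. If `data` is missing from that list, `pip install fastai.data` is unlikely to help, since a dotted import name refers to a module inside the installed package rather than to a separate pip package.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the sorted names of the direct submodules of an installed package."""
    package = importlib.import_module(package_name)
    # Only packages (not plain modules) have a __path__ to search.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    return sorted(info.name for info in pkgutil.iter_modules(search_path))


# Demonstrated on a stdlib package; replace "json" with "fastai" locally.
print(list_submodules("json"))
```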

nok · October 21, 2018, 9:59am

For GCP, the http-server and https-server tags should be added. Optionally, you can also add the --preemptible flag to get a cheaper price, in exchange for an instance that GCP may shut down at any time.

gcloud compute instances add-tags "pytorch-and-fastai-box" --tags jupyter,https-server,http-server

gcloud compute instances create $INSTANCE_NAME \
      --preemptible \
      --zone=$ZONE \
      --image-family=$IMAGE_FAMILY \
      --machine-type=$INSTANCE_TYPE \
      --boot-disk-size=50GB \
      --boot-disk-type "pd-ssd" \
      --image-project=deeplearning-platform-release \
      --maintenance-policy=TERMINATE \
      --accelerator="type=nvidia-tesla-k80,count=1" \
      --metadata="install-nvidia-driver=True" \
      --no-boot-disk-auto-delete

(Thanks to whoever posted the original script. I struggled for a day with the NVIDIA driver; this command installs everything while creating the instance. Perfect!)
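
The create command above relies on four shell variables that the post does not define. As a rough sketch, the same call can be assembled in Python with the values spelled out. Only the instance name comes from the add-tags command above; the zone, image family, and machine type below are placeholder assumptions that you would replace with your own choices.

```python
import shlex

# Placeholder values: only INSTANCE_NAME appears in the post (add-tags command).
# ZONE, IMAGE_FAMILY and INSTANCE_TYPE are assumptions for illustration.
settings = {
    "INSTANCE_NAME": "pytorch-and-fastai-box",
    "ZONE": "us-west1-b",
    "IMAGE_FAMILY": "pytorch-latest-gpu",
    "INSTANCE_TYPE": "n1-standard-4",
}


def build_create_command(s):
    """Assemble the `gcloud compute instances create` call as an argument list."""
    return [
        "gcloud", "compute", "instances", "create", s["INSTANCE_NAME"],
        "--preemptible",
        f"--zone={s['ZONE']}",
        f"--image-family={s['IMAGE_FAMILY']}",
        f"--machine-type={s['INSTANCE_TYPE']}",
        "--boot-disk-size=50GB",
        "--boot-disk-type=pd-ssd",
        "--image-project=deeplearning-platform-release",
        "--maintenance-policy=TERMINATE",
        "--accelerator=type=nvidia-tesla-k80,count=1",
        "--metadata=install-nvidia-driver=True",
        "--no-boot-disk-auto-delete",
    ]


command = build_create_command(settings)
# Print a copy-pasteable shell line; nothing is executed here.
print(" ".join(shlex.quote(arg) for arg in command))
```

Building the command as a list keeps each flag on its own line and makes it easy to drop `--preemptible` if you prefer an instance that is not shut down unexpectedly.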

gshashank84 · October 21, 2018, 4:22pm

I think they have removed it, thanks for sharing the issue. I have removed it from the Wiki.

noskill · October 21, 2018, 7:45pm

Hey! I just created a script for GCP that should set up and get a Jupyter notebook running within a few minutes.
Link - github
The steps mentioned there are working for me. Please try it and give feedback so that it can be improved.
Happy learning!!

Antti · October 21, 2018, 8:17pm

Hello,

I am doing a local installation, and have managed to

(1) update NVIDIA driver to 396.24
(2) run
conda install -c pytorch pytorch-nightly cuda92
and got output stating a version 1.0.0.dev…

However, this piece
python -c 'import fastai; fastai.show_install(0)'
now shows torch cuda version 9.0.176 (instead of 9.2). I don’t know whether this is what should have happened.

(3) Then I ran these
conda install -c fastai torchvision-nightly
conda install -c fastai fastai

…and the checks showed the correct outcome

However, when trying to run the lesson1 notebook, I get import errors. First on bcolz (which I installed with conda), and after that with cv2. It says ‘no module named cv2’.

Any guidance on what might have gone wrong, or is this to be expected?

matdmiller · October 21, 2018, 9:39pm

Did you run into this error when running the lesson 1 (part1v3) notebook in colab?

@sgugger has posted a solution, although it will slow things down. I ran into the same problem, which is Docker related, and Colab is (I believe) built on containers. I also posted a solution on the other thread, but it is part of the docker run command and I doubt the “docker run” type of fix will help with Colab. It may be worth testing this and writing it into your Colab Guide if others are going to try and use it.

veehoo · October 22, 2018, 9:28am

Hello,

I’ve tried to update my local computer to the fastai v1 setup. It seems to be working: I can train simple networks in PyTorch using the GPU, and e.g. the following looks sane:


python -c 'import fastai; fastai.show_install(0)'

```text
=== Software === 
python version  : 3.6.5
fastai version  : 1.0.11
torch version   : 1.0.0.dev20181020
nvidia driver   : 410.57
torch cuda ver  : 9.2.148
torch cuda is   : available
torch cudnn ver : 7104
torch cudnn is  : enabled

=== Hardware === 
nvidia gpus     : 1
torch available : 1
  - gpu0        : 8119MB | GeForce GTX 1070

=== Environment === 
platform        : Linux-4.18.14-arch1-1-ARCH-x86_64-with-arch
distro          : #1 SMP PREEMPT Sat Oct 13 13:42:37 UTC 2018
conda env       : fastai
python          : /home/veehoo/.conda/envs/fastai/bin/python
sys.path        : 
/home/veehoo/.conda/envs/fastai/lib/python36.zip
/home/veehoo/.conda/envs/fastai/lib/python3.6
/home/veehoo/.conda/envs/fastai/lib/python3.6/lib-dynload
/home/veehoo/.local/lib/python3.6/site-packages
/home/veehoo/.conda/envs/fastai/lib/python3.6/site-packages
/home/veehoo/.conda/envs/fastai/lib/python3.6/site-packages/cycler-0.10.0-py3.6.egg
/home/veehoo/.conda/envs/fastai/lib/python3.6/site-packages/IPython/extensions
```

But when I try to use the fastai stack, there’s something that triggers an error in multiprocessing code. The following error comes when I try to execute ‘examples/tabular.ipynb’ from fastai github in jupyter. The error comes after executing the last command in the notebook:


...
learn.fit(1, 1e-2)


Exception in thread Thread-4:
Traceback (most recent call last):
  File "/home/veehoo/.conda/envs/fastai/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/veehoo/.conda/envs/fastai/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/veehoo/.conda/envs/fastai/lib/python3.6/multiprocessing/resource_sharer.py", line 139, in _serve
    signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG))
  File "/home/veehoo/.conda/envs/fastai/lib/python3.6/signal.py", line 60, in pthread_sigmask
    sigs_set = _signal.pthread_sigmask(how, mask)
ValueError: signal number 32 out of range

Has anybody seen this kind of problem before? I found at least one possibly PyTorch-related report via Google, e.g. https://github.com/petronetto/pytorch-alpine/issues/3.

br, Ville

veehoo · October 22, 2018, 11:31am

Hi Antti,

Seems to me you’re not running the jupyter-notebook session in the right conda environment, so it’s detecting the host python and cuda installations and not the conda ones. Could that be the case?
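
A quick way to test this is to run a few lines in a notebook cell and see which interpreter the kernel is using. This is a minimal sketch (the helper name `describe_interpreter` is made up); it assumes the environment was activated with `conda activate`, which is what sets `CONDA_DEFAULT_ENV`.

```python
import os
import sys


def describe_interpreter():
    """Collect the facts that show which Python installation is running."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_interpreter().items():
    print(f"{key:>10}: {value}")
```

If the executable path points outside the conda environment's directory, the notebook is most likely running on the host Python.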

br, Ville

nok · October 22, 2018, 12:34pm

Are we going to use the course_v3 repository?

arunoda · October 22, 2018, 4:22pm

I think yes. The first lesson is already there, and it was so easy to follow and use compared to the previous version.

prsahu · October 22, 2018, 5:10pm

I am getting the following error on executing
python -c 'import fastai; fastai.show_install(0)'

init_27 · October 22, 2018, 5:11pm

Looks like you’re using py2.7. That might be the issue.
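
To confirm, you can check the interpreter version directly. This is a small sketch that assumes fastai v1's documented requirement of Python 3.6 or newer; the helper name is made up, and the code is written so that it also runs on Python 2.7.

```python
import sys

# Assumption: fastai v1 documents Python 3.6+ as its minimum.
MIN_VERSION = (3, 6)


def python_is_new_enough(version_info=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum


status = "OK" if python_is_new_enough() else "too old for fastai v1"
print("Python %s: %s" % (sys.version.split()[0], status))
```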

Antti · October 22, 2018, 5:31pm

Thanks - works now!

hasib_zunair · October 22, 2018, 6:00pm

Is this okay?

hasib_zunair · October 22, 2018, 6:09pm

Got this below, any work around on this?

RuntimeError: cuda runtime error (48) : no kernel image is available for execution on the device at /opt/conda/conda-bld/pytorch-nightly_1539431435477/work/aten/src/THC/generic/THCTensorMath.cu:14
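
This error generally means the installed PyTorch binary contains no compiled kernels for the GPU's compute capability, which tends to happen with older cards. As a hedged diagnostic sketch (the helper name `gpu_capability` is made up), the following prints the device name and compute capability so they can be compared against what the PyTorch build supports. It returns None when PyTorch is not installed or no CUDA device is visible.

```python
def gpu_capability(device_index=0):
    """Return (name, (major, minor)) for a CUDA device, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    name = torch.cuda.get_device_name(device_index)
    capability = torch.cuda.get_device_capability(device_index)
    return name, capability


print(gpu_capability())
```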

prsahu · October 22, 2018, 6:28pm

Which version should I go for?

prsahu · October 22, 2018, 6:45pm

Has anybody set this up on a Mac with eGPU support?

cwerner · October 22, 2018, 7:45pm

Hi

I tried your approach but I get this:
ERROR: (gcloud.beta.compute.instances.create) Could not fetch resource:

Required ‘compute.images.useReadOnly’ permission for ‘projects/fastai-v1-219409/global/images/fastai-image-v3’

Any idea what is happening?

noskill · October 22, 2018, 7:51pm

Yeah! Thanks for the feedback.
Actually, I spun up an instance from the marketplace image of fastai v1.0 + PyTorch 1.0 and then created an image of the disk attached to that instance.
Marketplace images take too long to load, and we need to install the NVIDIA drivers every time, so I created the image and made it public so that it can be used directly. There is some error with the permissions of the image; I will check that and get back to you.

cwerner · October 22, 2018, 7:52pm

Thanks