Unofficial Setup thread (Local, AWS)

I tried following these steps in colab.
I am getting ModuleNotFoundError: No module named ‘fastai.data’.

i see Successfully installed dataclasses-0.6 fastai-1.0.7 fastprogress-0.1.10 ipywidgets-7.4.2 jupyter-1.0.0 jupyter-console-6.0.0 numpy-1.15.2 qtconsole-4.4.2 torchvision-nightly-0.2.1 widgetsnbextension-3.4.2…

I tried doing !pip install fastai.data.
Still can’t install it.
I am able to import fastai.vision successfully. Any suggestions?

I am able to run vision example provided properly. Issue comes only with fastai.data.

For the GCP, tags of http and https should be added. Also, optionally you can add a preemptive tag to get a cheaper price for trade off.

gcloud compute instances add-tags "pytorch-and-fastai-box" --tags jupyter,https-server,http-server

gcloud compute instances create $INSTANCE_NAME \
      --preemptible \
      --zone=$ZONE \
      --image-family=$IMAGE_FAMILY \
      --machine-type=$INSTANCE_TYPE \
      --boot-disk-size=50GB \
      --boot-disk-type "pd-ssd" \
      --image-project=deeplearning-platform-release \
      --maintenance-policy=TERMINATE \
      --accelerator="type=nvidia-tesla-k80,count=1" \
      --metadata="install-nvidia-driver=True" \
      --no-boot-disk-auto-delete \
      --preemptible

(Thanks for whoever posting the original script, I struggle for a day about the nvidia driver, this command install everything while creating the instance, perfect!)

3 Likes

I think they have removed it, thanks for sharing the issue. I have removed it from the Wiki.

Hey! I just created a script for GCP that should setup and get jupyter notebook running within few minutes.
Link - github
The mentioned steps are working for me. Please try and give feedbacks so that it can be improved.
Happy learning!!

Hello,

I am doing local installation, and have managed to

(1) update NVIDIA driver to 396.24
(2) run
conda install -c pytorch pytorch-nightly cuda92
and got output stating a version 1.0.0.dev…

However, this piece
python -c 'import fastai; fastai.show_install(0)'
now shows torch cuda version 9.0.176 (instead of 9.2). I don’t know whether this is what should have happened.

(3) Then I ran these
conda install -c fastai torchvision-nightly
conda install -c fastai fastai

…and the checks showed the correct outcome

However, when trying to run the lesson1 notebook, I get import errors. First on bcolz (which I installed with conda), and after that with cv2. It says ‘no module named cv2’.

Any guidance on what might have gone wrong, or is this to be expected?

Did you run into this error when running the lesson 1 (part1v3) notebook in colab?

@sgugger has posted solution although it will slow things down. I ran into the same problem which is Docker related and Colab is (I beleive) built on containers. I also posted a solution on the other thread but it is part of the docker run command and I doubt the “docker run” type of fix will help w/ Colab. It may be worth testing this and writing this into your Colab Guide if others are going to try and use it.

Hello,

I’ve tried to update my local computer to fastai v1 setup. it’s seemingly working, I can train simple networks on pytorch using gpu, and e.g. the following looks sane:


python -c 'import fastai; fastai.show_install(0)'

```text
=== Software === 
python version  : 3.6.5
fastai version  : 1.0.11
torch version   : 1.0.0.dev20181020
nvidia driver   : 410.57
torch cuda ver  : 9.2.148
torch cuda is   : available
torch cudnn ver : 7104
torch cudnn is  : enabled

=== Hardware === 
nvidia gpus     : 1
torch available : 1
  - gpu0        : 8119MB | GeForce GTX 1070

=== Environment === 
platform        : Linux-4.18.14-arch1-1-ARCH-x86_64-with-arch
distro          : #1 SMP PREEMPT Sat Oct 13 13:42:37 UTC 2018
conda env       : fastai
python          : /home/veehoo/.conda/envs/fastai/bin/python
sys.path        : 
/home/veehoo/.conda/envs/fastai/lib/python36.zip
/home/veehoo/.conda/envs/fastai/lib/python3.6
/home/veehoo/.conda/envs/fastai/lib/python3.6/lib-dynload
/home/veehoo/.local/lib/python3.6/site-packages
/home/veehoo/.conda/envs/fastai/lib/python3.6/site-packages
/home/veehoo/.conda/envs/fastai/lib/python3.6/site-packages/cycler-0.10.0-py3.6.egg
/home/veehoo/.conda/envs/fastai/lib/python3.6/site-packages/IPython/extensions

But when I try to use the fastai stack, there’s something that triggers an error in multiprocessing code. The following error comes when I try to execute ‘examples/tabular.ipynb’ from fastai github in jupyter. The error comes after executing the last command in the notebook:


...
learn.fit(1, 1e-2)


Exception in thread Thread-4:
Traceback (most recent call last):
  File "/home/veehoo/.conda/envs/fastai/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/veehoo/.conda/envs/fastai/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/veehoo/.conda/envs/fastai/lib/python3.6/multiprocessing/resource_sharer.py", line 139, in _serve
    signal.pthread_sigmask(signal.SIG_BLOCK, range(1, signal.NSIG))
  File "/home/veehoo/.conda/envs/fastai/lib/python3.6/signal.py", line 60, in pthread_sigmask
    sigs_set = _signal.pthread_sigmask(how, mask)
ValueError: signal number 32 out of range

Has anybody seen this kind of problem before? I found at least something possibly pytorch related from google. e.g. https://github.com/petronetto/pytorch-alpine/issues/3.

br, Ville

Hi Antti,

Seems to me you’re not running the jupyter-notebook session in the right conda environment, so it’s detecting the host python and cuda installations and not the conda ones. Could that be the case?

br, Ville

Are we going to use the course_v3 repository?

I think yes. First lesson is already there and it was so easy follow and use compare to the previous version.

I am getting the following error on executing
python -c ‘import fastai; fastai.show_install(0)’

Looks like you’re using py2.7. That might be the issue.

2 Likes

Thanks - works now!

Is this okay?

Got this below, any work around on this? :frowning:

RuntimeError: cuda runtime error (48) : no kernel image is available for execution on the device at /opt/conda/conda-bld/pytorch-nightly_1539431435477/work/aten/src/THC/generic/THCTensorMath.cu:14

Which version I should go for?

Anybody has setup this on Mac with eGPU support?

Hi

I try your approach but I get this:
ERROR: (gcloud.beta.compute.instances.create) Could not fetch resource:

  • Required ‘compute.images.useReadOnly’ permission for ‘projects/fastai-v1-219409/global/images/fastai-image-v3’

Any idea what is happening?

yeah! thanks for feedback.
Actually i spinned an instance from marketplace-image of fastaiv1.0 + pytorch1.0 and then created an image of that disk attached to instance.
Marketplace images takes too long time to load and we need to install nvidia-drivers everytime, so i created the image and made it public so that it can be used directly. There is some error with the permissions of the image, i will check that out and revert to you.

Thanks