Dev Install problem on Paperspace


(Robert Bracco) #1

Hey guys, I am still having the same error, this time on a fresh notebook on Paperspace Gradient. I made an editable install with the developer prerequisites using pip install -e ".[dev]"

I made a very minor code change that can’t be breaking anything, but when I run make test or pytest I get this error:

/bin/sh: 1: /usr/local/cuda/bin/nvcc: not found

On further inspection in /usr/local there are two folders, cuda/ and cuda-9.2/, neither has a bin folder.

Not sure where to go from here, I would love to start contributing but I need help.


(Stas Bekman) #2

Please always review https://docs.fast.ai/support.html when asking for help. We can’t help if you don’t provide the necessary details. In this particular case, we need at least the full traceback and the platform info as explained in the doc.
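The support page linked above is the authoritative list of what to include. As a generic stand-in that assumes nothing beyond the Python standard library, a few basic platform details can be collected like this (the function name `collect_env_info` is made up for illustration):

```python
import platform
import sys


def collect_env_info():
    """Return a few environment facts that are useful in a bug report."""
    return {
        "python": sys.version.split()[0],   # interpreter version, e.g. "3.6.8"
        "platform": platform.platform(),    # OS / kernel description
        "executable": sys.executable,       # which interpreter is actually running
    }


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(f"{key}: {value}")
```

Pasting this output next to the full traceback covers the basics, though it is not a replacement for whatever the support page asks for.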

> On further inspection in /usr/local there are two folders, cuda/ and cuda-9.2/, neither has a bin folder.

Prebuilt PyTorch is self-contained and doesn’t rely on the system-wide CUDA install, so the missing bin folder doesn’t matter for normal use.

fastai doesn’t directly use nvcc anywhere, so it must be some subsystem that invokes it. A traceback will tell.
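To confirm whether a compiler is reachable at all, a quick check along these lines may help. It is only a sketch: the default `/usr/local/cuda` location is taken from the error message in this thread, and `find_nvcc` is a hypothetical helper, not part of fastai or PyTorch.

```python
import os
import shutil
from pathlib import Path


def find_nvcc(cuda_home="/usr/local/cuda"):
    """Return a path to nvcc, or None if it cannot be found.

    Looks in <cuda_home>/bin first (assumed layout), then falls back to PATH.
    """
    candidate = Path(cuda_home) / "bin" / "nvcc"
    if candidate.is_file() and os.access(candidate, os.X_OK):
        return str(candidate)
    return shutil.which("nvcc")


if __name__ == "__main__":
    print("nvcc:", find_nvcc())
```

If this prints `nvcc: None`, anything that tries to compile a CUDA extension at runtime will likely fail the same way, even though prebuilt PyTorch itself runs fine.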


(Robert Bracco) #3

Sorry for the bad format, reviewed the support page and should be good on all other fronts as it’s a fresh paperspace gradient notebook running fast.ai 1.0+ template. Traceback is included below. Thank you.

python setup.py --quiet test
warning: no previously-included files matching '__pycache__' found under directory '*'
warning: no files found matching 'conf.py' under directory 'docs'
warning: no files found matching 'Makefile' under directory 'docs'
warning: no files found matching 'make.bat' under directory 'docs'
============================= test session starts ==============================
platform linux -- Python 3.6.8, pytest-4.3.0, py-1.8.0, pluggy-0.9.0
rootdir: /notebooks/fastai-fork, inifile: setup.cfg
plugins: xdist-1.26.1, forked-1.0.2
collected 256 items / 1 errors / 255 selected

==================================== ERRORS ====================================
___________________ ERROR collecting tests/test_text_qrnn.py ___________________
/opt/conda/envs/fastai/lib/python3.6/site-packages/torch/utils/cpp_extension.py:946: in _build_extension_module
check=True)
/opt/conda/envs/fastai/lib/python3.6/subprocess.py:438: in run
output=stdout, stderr=stderr)
E subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:
tests/test_text_qrnn.py:3: in <module>
from fastai.text.models import qrnn
fastai/text/models/qrnn.py:11: in <module>
forget_mult_cuda = load(name='forget_mult_cuda', sources=[fastai_path/f for f in files])
/opt/conda/envs/fastai/lib/python3.6/site-packages/torch/utils/cpp_extension.py:645: in load
is_python_module)
/opt/conda/envs/fastai/lib/python3.6/site-packages/torch/utils/cpp_extension.py:814: in _jit_compile
with_cuda=with_cuda)
/opt/conda/envs/fastai/lib/python3.6/site-packages/torch/utils/cpp_extension.py:863: in _write_ninja_file_and_build
_build_extension_module(name, build_directory, verbose)
/opt/conda/envs/fastai/lib/python3.6/site-packages/torch/utils/cpp_extension.py:959: in _build_extension_module
raise RuntimeError(message)
E RuntimeError: Error building extension 'forget_mult_cuda': [1/2] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=forget_mult_cuda -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/envs/fastai/lib/python3.6/site-packages/torch/lib/include -isystem /opt/conda/envs/fastai/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /opt/conda/envs/fastai/lib/python3.6/site-packages/torch/lib/include/TH -isystem /opt/conda/envs/fastai/lib/python3.6/site-packages/torch/lib/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/envs/fastai/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -std=c++11 -c /notebooks/fastai-fork/fastai/text/models/forget_mult_cuda_kernel.cu -o forget_mult_cuda_kernel.cuda.o
E FAILED: forget_mult_cuda_kernel.cuda.o
E /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=forget_mult_cuda -DTORCH_API_INCLUDE_EXTENSION_H -isystem /opt/conda/envs/fastai/lib/python3.6/site-packages/torch/lib/include -isystem /opt/conda/envs/fastai/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -isystem /opt/conda/envs/fastai/lib/python3.6/site-packages/torch/lib/include/TH -isystem /opt/conda/envs/fastai/lib/python3.6/site-packages/torch/lib/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/envs/fastai/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -std=c++11 -c /notebooks/fastai-fork/fastai/text/models/forget_mult_cuda_kernel.cu -o forget_mult_cuda_kernel.cuda.o
E /bin/sh: 1: /usr/local/cuda/bin/nvcc: not found
E ninja: build stopped: subcommand failed.
!!! Interrupted: 1 errors during collection !!!
=========================== 1 error in 4.21 seconds ============================
Makefile:169: recipe for target 'test' failed
make: *** [test] Error 2

#4

Let’s try to keep the developer chat for development announcements, moved you here to continue this discussion.


(Stas Bekman) #5

Much better, @MadeUpMasters. So it looks like that test builds a CUDA extension, which requires nvcc to be available, and probably some other NVIDIA CUDA components too. I guess all devs happened to have these installed, so it was never a problem.

Please try again after git pull, I added a skip in that test module, so it’ll be skipped by default now.
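The actual change made in fastai isn't shown in this thread. As a rough illustration of the idea only, here is a sketch that skips a test class when nvcc is missing. It uses the standard library's unittest so that it runs without extra packages, and the class and test names are hypothetical. Treating "nvcc on PATH" as the signal for a usable CUDA toolchain is also an assumption.

```python
import shutil
import unittest

# Assumption: nvcc being on PATH is a good-enough proxy for "can build CUDA extensions".
HAS_NVCC = shutil.which("nvcc") is not None


@unittest.skipUnless(HAS_NVCC, "requires nvcc to build the CUDA extension")
class TestNeedsCudaBuild(unittest.TestCase):  # hypothetical test class
    def test_extension_builds(self):
        # A real test would import the module that JIT-compiles the extension here,
        # inside the test, so that the import cannot fail before the skip is applied.
        self.assertTrue(HAS_NVCC)


def run_suite():
    """Run the test class and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestNeedsCudaBuild)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    outcome = run_suite()
    print("run:", outcome.testsRun, "skipped:", len(outcome.skipped))
```

The point to note is ordering: in the traceback above the failure happens at import time during collection, so any skip has to take effect before the import that triggers the build.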


(Robert Bracco) #6

Thanks for the help, I really appreciate it. I tried git pull and it said everything was up to date, so I ran git fetch upstream which yielded

remote: Counting objects: 100% (29/29), done.
remote: Total 35 (delta 29), reused 29 (delta 29), pack-reused 6
Unpacking objects: 100% (35/35), done.
From https://github.com/fastai/fastai
   2823a2c..9336a18  master     -> upstream/master

But from this point on if I run commands like git checkout master or git status I get the following error

tools/fastai-nbstripout -d: 1: tools/fastai-nbstripout -d: tools/fastai-nbstripout: not found
error: external filter tools/fastai-nbstripout -d failed -1
error: external filter tools/fastai-nbstripout -d failed
fatal: courses/dl1/adamw-sgdw-demo.ipynb: clean filter 'fastai-nbstripout-docs' failed

Looking in my fastai-fork folder it seems that it is missing a bunch of folders (alphabetically the last one is docs), my fork on github doesn’t have that problem. It is also telling me that they are up to date and in sync.
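The error message says git tried to run the clean filter command tools/fastai-nbstripout -d and could not find the script, which fits with the tools folder being among the missing ones. A small sketch for checking that, assuming the filter command is resolved relative to the repository root (the helper name `filter_script_exists` is made up for illustration):

```python
import shlex
from pathlib import Path


def filter_script_exists(filter_cmd, repo_root):
    """Return True if the program named in a git filter command exists under repo_root.

    filter_cmd is the configured command string, e.g. "tools/fastai-nbstripout -d".
    Only the first token (the program) is checked; its arguments are ignored.
    """
    program = shlex.split(filter_cmd)[0]
    return (Path(repo_root) / program).is_file()


if __name__ == "__main__":
    print(filter_script_exists("tools/fastai-nbstripout -d", "."))
```

Run from the top of the checkout, a result of False would suggest the working tree is incomplete rather than that git itself is misbehaving.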

I’ll go ahead and create a fresh notebook and start over again this afternoon, but I’d appreciate it if you could let me know if I did something wrong to break it. Thanks.


(Stas Bekman) #7

Yes, it does appear you are missing files. That’s highly unusual.

  1. First try the origin master: can you clone and work with it (e.g. status, checkout, pull)?
  2. Then try our branching script https://docs.fast.ai/dev/git.html#helper-program and check the same.

Let me know if either of these has a problem.


(Robert Bracco) #8

Hey, thanks again, I tried both and had no success, I just continued getting

tools/fastai-nbstripout -d: 1: tools/fastai-nbstripout -d: tools/fastai-nbstripout: not found
error: external filter tools/fastai-nbstripout -d failed -1
error: external filter tools/fastai-nbstripout -d failed
fatal: courses/dl1/adamw-sgdw-demo.ipynb: clean filter 'fastai-nbstripout-docs' failed

due to missing files. When I pull origin-master it says I’m up to date, but I couldn’t do anything else. I’m fairly new to git so I’m not totally sure if I’m doing something else wrong, but I will go ahead and do another fresh install and start from scratch. If that breaks I’ll let you know. Thanks for the help.


(Stas Bekman) #9

This is very strange. You don’t need to install anything to get this working.

You should be able to do at least this:

git clone https://github.com/fastai/fastai
cd fastai
tools/run-after-git-clone
git pull

You won’t be able to commit from it, but if it works, it should help you compare it with your fork. Let me know if this approach works.

In any case, for future follow ups, please always paste the exact commands you tried, otherwise when you say “I tried both and had no success” I don’t know what you did exactly.


(Robert Bracco) #10

Hey, thanks again for the help with this. I’m making a fresh notebook on paperspace and I’ll follow the guide again to see if I can get it working. I’ll go slow and post here with specific commands and full stack traces for any problems I encounter. Cheers.


(Stas Bekman) #11

Perhaps one of the Paperspace folks can help you here, since this clearly has something to do with Paperspace. @dillon, perhaps you could help @MadeUpMasters get a fastai dev setup working on Paperspace? If that is not possible, please let us know, so that we don’t waste time here trying to figure it out. Thanks.


(Robert Bracco) #12

I’m not married to paperspace, there’s a lot I don’t like about it. Is there a platform the majority of the devs are on? Or do they tend to have their own machines?


(Stas Bekman) #13

We mostly use our own machines.