Fastai v1 install issues thread


(Kristian Rother) #309

Hi,

I did some quick tests, and something must have changed in how the libraries are loaded between fastai 1.0.41 and 1.0.42. Everything works fine with both pytorch 1.0.0 and pytorch 1.0.1 on fastai 1.0.41, but after upgrading to fastai 1.0.42 or above the “NVML Shared Library Not Found” error appears when running from fastai.text import *. When I downgrade to fastai 1.0.41 everything works again, so I’m assuming a fastai change broke things.

Not a huge issue, I’m working with .41 for now but it would be nice to have load_data() etc. from the most recent version :slight_smile:

I’m currently training a Language Model so the computer will be blocked for some days but I can re-break things and provide a full report later if desired.

tl;dr: I think it might actually be a fast.ai change that breaks things on Windows and not a pytorch/windows issue since I’m fairly sure my pytorch/cuda is set up correctly (and works nicely with fastai <1.0.42 both for a forced pytorch 1.0.0 and the current version)

(Windows 10)


Nvml.dll loading issue in nvidia-ml-py3-7.352.0-py_0
(Stas Bekman) #310

Ah, then it has to do with pynvml.

There are two parts to this reply.

First, can you run:

python -c "import pynvml; pynvml.nvmlInit()"

You may need single quotes on Windows if double quotes aren’t working, or put it in a script. Please include the full backtrace if it errors.
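For the "put it in a script" route, here is a minimal sketch that runs the same check and captures the full backtrace. The function name check_nvml is hypothetical. It assumes only that pynvml (from nvidia-ml-py3) exposes nvmlInit, nvmlSystemGetDriverVersion and nvmlShutdown.

```python
import traceback


def check_nvml():
    """Try to initialise NVML and return (ok, details)."""
    try:
        import pynvml  # installed by the nvidia-ml-py3 package

        pynvml.nvmlInit()
        try:
            driver = pynvml.nvmlSystemGetDriverVersion()
        finally:
            # Always release NVML once it has been initialised
            pynvml.nvmlShutdown()
        return True, "NVML initialised, driver version: {}".format(driver)
    except Exception:
        # Covers a missing pynvml module as well as NVML loading failures
        return False, traceback.format_exc()


if __name__ == "__main__":
    ok, details = check_nvml()
    print("OK" if ok else "FAILED")
    print(details)
```

If it prints FAILED, the text below it is the full backtrace to paste into the report.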

Did you install with conda or pip? Either way, the fastai dependencies should have installed the nvidia-ml-py3 package, which is supposed to work on Windows. Have a look at the project’s homepage: https://github.com/nicolargo/nvidia-ml-py3


Second, whenever you report installation errors, please follow the guide
https://docs.fast.ai/support.html#reporting-issues
and in general always provide a full stack backtrace, because we need to know where the error occurred.

Therefore, please do that now for the original report of the “NVML Shared Library Not Found” error (when running from fastai.text import *), so we can look at why this happened in the first place, as it shouldn’t need to load nvml for that functionality.

Finally, can you do that with 1.0.46, so that we are testing the latest code base?

Thank you.


(Stas Bekman) #311

Looks like we now have someone who understands how to fix that, see this thread:


(Partho P. Das) #312

@Preka @rother can you please run the following commands in PowerShell and send me the output, preferably on this thread?

cd $env:SystemDrive
dir -rec -filt *nvml*.dll -ea SilentlyContinue | % { $_.FullName }

I want to make sure there isn’t a third path that needs to be added to the search paths, for now at least.

Also, I have confirmation from 2 other Win10+GPU machines that nvml.dll is under system32.
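If PowerShell is not convenient, roughly the same search can be sketched in Python. The helper name find_files is hypothetical, and the matching is deliberately case-insensitive. Like -ea SilentlyContinue, os.walk skips directories it cannot read by default.

```python
import fnmatch
import os


def find_files(root, pattern="*nvml*.dll"):
    """Return full paths under root whose file name matches pattern (case-insensitive)."""
    matches = []
    # os.walk ignores unreadable directories unless an onerror callback is given
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatchcase(name.lower(), pattern.lower()):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

On Windows, something like print("\n".join(find_files("C:\\"))) would list the candidates. Scanning a whole drive can take a while.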


(Battle500) #313

I can confirm that this is a fastai issue rather than a pytorch one.
Furthermore, it is a Windows problem: I also tried fastai 1.0.42 in a virtual Ubuntu environment on Windows and it upgraded without any problem.


(Battle500) #314

Hey man,

well, File Not Found in Windows\System32.


(Tejaswini M) #315

I’ve been trying to get fastai v1 to work on my Ubuntu 18.04 installation. I get the following error and traceback -

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-2-0ce5c57f63e0> in <module>
      1 # Import necessary libraries
      2 from fastai import *
----> 3 from fastai.vision import *
      4 import matplotlib.pyplot as plt

ModuleNotFoundError: No module named 'fastai.vision'

To my knowledge, I used the exact installation instructions. There is a section on this error in the Troubleshooting document, but since I only have one environment, I did not find it to be useful.

Details of my installation are below -

=== Software === 
python        : 3.6.8
fastai        : 1.0.47.dev0
fastprogress  : 0.1.20
torch         : 1.0.1.post2
nvidia driver : 415.27
torch cuda    : 9.0.176 / is available
torch cudnn   : 7402 / is enabled

=== Hardware === 
nvidia gpus   : 1
torch devices : 1
  - gpu0      : 4042MB | GeForce MX150

=== Environment === 
platform      : Linux-4.15.0-45-generic-x86_64-with-debian-buster-sid
distro        : Ubuntu 18.04 bionic
conda env     : fastai
python        : /home/tejaswini/anaconda3/envs/fastai/bin/python
sys.path      : 
/home/tejaswini/anaconda3/envs/fastai/lib/python36.zip
/home/tejaswini/anaconda3/envs/fastai/lib/python3.6
/home/tejaswini/anaconda3/envs/fastai/lib/python3.6/lib-dynload
/home/tejaswini/.local/lib/python3.6/site-packages
/home/tejaswini/anaconda3/envs/fastai/lib/python3.6/site-packages

Tue Mar  5 16:56:03 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 415.27       Driver Version: 415.27       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce MX150       Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   45C    P8    N/A /  N/A |    269MiB /  4042MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1334      G   /usr/lib/xorg/Xorg                           138MiB |
|    0      1510      G   /usr/bin/gnome-shell                         130MiB |
+-----------------------------------------------------------------------------+

Output of python --version

Python 3.6.8 :: Anaconda, Inc.

Output of nvidia-smi

Tue Mar  5 16:56:24 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 415.27       Driver Version: 415.27       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce MX150       Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   43C    P8    N/A /  N/A |    269MiB /  4042MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1334      G   /usr/lib/xorg/Xorg                           138MiB |
|    0      1510      G   /usr/bin/gnome-shell                         130MiB |
+-----------------------------------------------------------------------------+

Output of which python

/home/tejaswini/anaconda3/envs/fastai/bin/python

Output of which jupyter

/home/tejaswini/anaconda3/envs/fastai/bin/jupyter

Output of print(sys.path)

['/home/tejaswini/anaconda3/envs/fastai/lib/python36.zip', '/home/tejaswini/anaconda3/envs/fastai/lib/python3.6', '/home/tejaswini/anaconda3/envs/fastai/lib/python3.6/lib-dynload', '', '/home/tejaswini/.local/lib/python3.6/site-packages', '/home/tejaswini/anaconda3/envs/fastai/lib/python3.6/site-packages', '/home/tejaswini/anaconda3/envs/fastai/lib/python3.6/site-packages/IPython/extensions', '/home/tejaswini/.ipython']

Output of python -m fastai.utils.show_install

=== Software === 
python        : 3.6.8
fastai        : 1.0.46
fastprogress  : 0.1.20
torch         : 1.0.1.post2
nvidia driver : 415.27
torch cuda    : 9.0.176 / is available
torch cudnn   : 7402 / is enabled

=== Hardware === 
nvidia gpus   : 1
torch devices : 1
  - gpu0      : 4042MB | GeForce MX150

=== Environment === 
platform      : Linux-4.15.0-45-generic-x86_64-with-debian-buster-sid
distro        : Ubuntu 18.04 bionic
conda env     : fastai
python        : /home/tejaswini/anaconda3/envs/fastai/bin/python
sys.path      : 
/home/tejaswini/anaconda3/envs/fastai/lib/python36.zip
/home/tejaswini/anaconda3/envs/fastai/lib/python3.6
/home/tejaswini/anaconda3/envs/fastai/lib/python3.6/lib-dynload
/home/tejaswini/.local/lib/python3.6/site-packages
/home/tejaswini/anaconda3/envs/fastai/lib/python3.6/site-packages

I’ve seen other posts on this thread reporting a similar problem, but I don’t see a solution that has helped me so far. Can someone please help me with this?


(Stas Bekman) #316

The main clue is that only the second import failed. It most likely means that you have fastai-0.7 installed somewhere and it gets picked before fastai-1.0. Otherwise, it should have failed the first import, but it didn’t.

Please do from your notebook:

import sys
print(sys.modules['fastai'])

and check what’s at that path. Most likely it’s fastai from fastai-0.7.
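A complementary check, sketched under the assumption that you want to see where a package would be loaded from without importing it first. The helper name module_origin is hypothetical. importlib.util.find_spec is the standard-library call doing the lookup.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Run this from the same folder and environment as the failing notebook
    print(module_origin("fastai"))
```

A path outside the environment's site-packages suggests another copy is being picked up first.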


(Tejaswini M) #317

Yep, that was it. Thanks @stas!
I get that error only when I run my notebook from courses/dl1, in which case the output is

<module 'fastai' from '/home/tejaswini/fastai/courses/dl1/fastai/__init__.py'>

which is fastai 0.7.

From anywhere else, it gets fastai from the sys.path.

<module 'fastai' from '/home/tejaswini/anaconda3/envs/fastai/lib/python3.6/site-packages/fastai/__init__.py'>

Thanks again.


(Stas Bekman) #318

Excellent.

That’s the symlink issue in that folder. Hopefully by summer once the new part2 is done we will be able to finally move those nbs into a different repo.

And each notebook under courses/*/ starts with “do not run these notebooks under fastai-1.0” and each folder has 00-DO-NOT-USE-WITH-FASTAI-1.0.x.txt

I will also add a note to https://docs.fast.ai/troubleshoot.html#modulenotfounderror-no-module-named-fastaivision to make sure the user is not under courses/*/.
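To make the "not under courses/*/" check concrete, here is a rough sketch that looks for a local package or module that would win over the installed one when the notebook's folder comes ahead of site-packages on sys.path. The helper name local_shadow is hypothetical.

```python
import os


def local_shadow(package, directory):
    """Return (path, is_symlink) for a local package or module named `package`, else None."""
    pkg_dir = os.path.join(directory, package)
    # A regular package needs an __init__.py; a symlinked folder counts too
    if os.path.isfile(os.path.join(pkg_dir, "__init__.py")):
        return pkg_dir, os.path.islink(pkg_dir)
    module_file = pkg_dir + ".py"
    if os.path.isfile(module_file):
        return module_file, os.path.islink(module_file)
    return None


if __name__ == "__main__":
    print(local_shadow("fastai", os.getcwd()))
```

Run from a folder such as courses/dl1, a non-None result means that folder carries its own fastai.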


(Tejaswini M) #319

I will also add a note to https://docs.fast.ai/troubleshoot.html#modulenotfounderror-no-module-named-fastaivision to make sure the user is not under courses/*/ .

Yes, that will be helpful. Thanks!


(Siddhertha Basak) #320

Hi,
I am trying to run the fastai video tutorial demos on my local Windows system. I am not able to import the fastai libraries in my Jupyter notebook. When I try to install the fastai libraries using pip, I get the following error:
“Command “python setup.py egg_info” failed with error code 1 in c:\users\basaks\appdata\local\temp\pip-install-qdpnqe\torch”

Could anybody help me set up the fastai library in Jupyter notebook on my Windows system?


(Stas Bekman) #321

It is very difficult to make sense of part of an error in any bug report. Please always submit the full traceback - then we can help.

Until then you can also use google:

https://www.google.com/search?q=command+python+setup.py+egg_info+failed+with+error+code+1+in+torch&ie=utf-8&oe=utf-8

This seems like a torch-related issue, i.e. your pip is failing to install torch, so leave fastai aside for now and first figure out how to install torch.


#322

After an error occurred when I installed with conda install -n fastai10py36 pytorch torchvision cudatoolkit=8.0 pytorch, I tested and installed PyTorch successfully as follows:

pip install https://download.pytorch.org/whl/cu80/torch-1.0.1-cp36-cp36m-win_amd64.whl
pip install torchvision

Then I tried to install fastai with pip install fastai, and got the error below:

build\temp.win-amd64-3.6\Release\bottleneck/src/reduce.obj : fatal error LNK1000: Internal error during IMAGE::Pass1
 
 error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\BIN\\x86_amd64\\link.exe' failed with exit status 1000

  ----------------------------------------
  Failed building wheel for bottleneck
  Running setup.py clean for bottleneck
  Building wheel for ujson (setup.py) ... error
  
  ----------------------------------------
  Failed building wheel for ujson
  Running setup.py clean for ujson
  Building wheel for cytoolz (setup.py) ... error

I tried some suggestions from forums.fast.ai, but it still does not work.

I hope to get some help from you, thank you!


(Stas Bekman) #323

First, you’re not installing pytorch correctly with conda, you’re missing -c pytorch. Follow the exact instructions at https://pytorch.org/get-started/locally/

Second, your system is not set up with compiler tools, so all those pip packages fail to build. If you use conda you won’t have this problem, as its packages are all prebuilt binaries.

Follow the exact fastai conda install instructions: https://github.com/fastai/fastai/blob/master/README.md#conda-install and everything will work.


#324

I have a configuration with Python36+cuda80+vs2015

I tried installing PyTorch the recommended way before, but got the following error:

[Errno 13] Permission denied: 'D:\\anaconda\\anaconda-install\\pkgs\\pytorch-1.0.1-py3.6_cuda80_cudnn7_1\\info\\recipe\\bld.bat'

(Madhurjya Roy) #325

There is an issue with running fastai on Windows, which is caused by PyTorch 1.0.1 (https://github.com/pytorch/pytorch/issues/17108). PyTorch 1.0.0 works fine, though. It might be worth adding this to the troubleshooting guide, asking people to explicitly install pytorch=1.0.0 if they get the associated error on their Windows setup.
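A small sketch of how one might detect the combination reported above before deciding to pin pytorch=1.0.0. The helper name matches_reported_combo is hypothetical. Treating post-releases such as 1.0.1.post2 as affected is an assumption made here, not something stated in this thread.

```python
def matches_reported_combo(torch_version, system):
    """Return True for a torch 1.0.1 build running on Windows."""
    # Compare only major.minor.patch, so "1.0.1.post2" counts as 1.0.1 (assumption)
    release = tuple(torch_version.split(".")[:3])
    return system == "Windows" and release == ("1", "0", "1")
```

In a real environment the inputs would be torch.__version__ and platform.system().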


#326

Thanks a lot!
The following script works well!

# configuration with Python36+cuda80+vs2015
conda create -n <env_name> python=3.6
conda activate <env_name>
python --version
# in China, many people recommend not installing from the original source, so we drop the `-c` before the final `pytorch`
conda install pytorch==1.0.0 torchvision cudatoolkit=8.0 pytorch
conda install -c fastai fastai

(Arkadiusz Bicz) #327

Has support for CPU-only training been dropped in the latest fastai (v1.0.48)?

Judging from the installation, the latest fastai version that supports pytorch-cpu is 1.0.34.


(Stas Bekman) #328

Please feel free to submit a PR, @mroy. BTW, skimming through that issue it appears that it has been fixed and is just waiting for the next release:
https://github.com/pytorch/pytorch/issues/17108#issuecomment-465793918
So perhaps the nightly build can be a second alternative, alongside pytorch-1.0.0, which has issues of its own.