After connecting via SSH to the Ubuntu VM (a Microsoft DL VM hosted in the Azure cloud) through a Git Bash terminal, I followed the steps below.
Now, move into a directory where you are comfortable installing the fastai repo, with its libraries and required packages. I did this in the default (home) directory, so the commands below assume that.
Now you need to clone the repo as follows:
git clone https://github.com/fastai/fastai
Once the cloning process finishes, make sure you are in the directory from which you ran the clone (the parent of the new fastai directory created by git), and type:
conda env create -f fastai/environment.yml
Having tried all of the steps outlined here and in other posts, I have created the fastai source and environment on my new NC6 instance.
To create the environment:
~/fastai$ conda env create -f ~/fastai/environment.yml
To activate it:
source activate fastai
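Putting the clone, environment creation and activation steps together, the whole sequence looks like this (a sketch assuming the default home directory and that conda is already on the PATH):

```shell
# Clone the fastai repo into the home directory (the default location)
cd ~
git clone https://github.com/fastai/fastai

# Create the conda environment from the repo's environment.yml
conda env create -f fastai/environment.yml

# Activate it (older conda syntax; newer versions use `conda activate`)
source activate fastai
```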
But it doesn’t really work for me.
The first issue is that when I use the URL supplied when I invoke jupyter notebook I end up in my local system - not on the Azure box.
The second thing is that I tried import torch from within ipython on the NC6 system, and I got an error (below) - so there must be some missing dependency, but I am not sure how to tackle that!
Do you have any suggestions for either or both errors?
Python 3.6.4 |Anaconda, Inc.| (default, Mar 13 2018, 01:15:57)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import torch
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-eb42ca6e4af3> in <module>()
----> 1 import torch
~/.conda/envs/fastai/lib/python3.6/site-packages/torch/__init__.py in <module>()
54 except ImportError:
55 pass
---> 56 from torch._C import *
57
58 __all__ += [name for name in dir(_C)
ImportError: /opt/intel/mkl/lib/intel64/libmkl_gf_lp64.so: undefined symbol: mkl_lapack_ao_ssyrdb
Related to this issue
“The first issue is that when I use the URL supplied when I invoke jupyter notebook I end up in my local system - not on the Azure box”
You are connecting to a Linux VM hosted in the Azure cloud. You can reach the VM either through a remote desktop session or through a Bash shell window like Cygwin or Git Bash on your local Windows machine.
Here, are you working inside the Azure DL VM machine itself, or are you connecting through a Linux Bash command tool window?
If you are connecting to the VM from a Linux Bash command window like a Cygwin or Git Bash shell, then you will open the link in your local browser, but you will be using the GPU of the Azure DL cloud VM.
For the second issue, perhaps pytorch is not installed correctly, so try reinstalling it.
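A reinstall from inside the environment might look like this (a sketch assuming conda and the pytorch channel; the repo's environment.yml pins the exact version, so a plain conda install is only an approximation):

```shell
# Activate the fastai environment first
source activate fastai

# Reinstall pytorch (and torchvision) from the pytorch channel
conda install -c pytorch pytorch torchvision
```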
I’ve connected through the Windows Linux subsystem Bash window.
I’ve tried the recommended approach ssh -L 8888:127.0.0.1:8888 myserveraddress
I also tried without the -L 8888:127.0.0.1:8888
Once there, I start jupyter notebook --no-browser, and copy and paste the URL I am given into my browser (on my local PC). I just end up looking at the currently running notebooks on local PC, which also are on port 8888. I want to see the notebooks that are on the server.
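Since a local notebook server is already holding port 8888, one workaround (a sketch; myserveraddress stands in for the VM's address) is to forward a different local port so the two servers cannot collide:

```shell
# Forward local port 8889 to port 8888 on the VM, so the tunnel
# cannot clash with the notebook server already running locally
ssh -L 8889:127.0.0.1:8888 myserveraddress

# On the VM, start jupyter without opening a browser
jupyter notebook --no-browser

# Then browse to http://localhost:8889 on the local PC,
# re-using the token from the URL jupyter printed
```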
For the pytorch, I thought that setting up the fastai environment should install it correctly. It starts, but it seems to need something it hasn’t got…
Perhaps something is broken in the latest Pytorch?
But when I look at the pytorch package I have this:
pytorch: 0.3.1-py36_cuda9.0.176_cudnn7.0.5_2 pytorch [cuda90]
Can I safely upgrade the CUDA, or is that fixed on the VM? Otherwise, I can step back to an earlier version of Pytorch, but I know it will handle some things differently, which is possibly an issue for the fastai programs, which expect CUDA 9.0.
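Before upgrading or downgrading anything, it may help to confirm what is actually on the VM; these checks are a sketch assuming the usual install locations:

```shell
# GPU driver status on the VM
nvidia-smi

# CUDA toolkit version installed on the VM
cat /usr/local/cuda/version.txt

# Which CUDA build of pytorch the conda env contains
conda list pytorch
```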
Regarding the notebooks - here are the ports that are open - perhaps I need to add 8888?
Can you try closing your local server running on 8888 and then connect to the instance? I guess your request from the browser is first being served by local jupyter server.
Regarding pytorch, I tried it 10 days back and it’s working fine. Let me check once.
I got a connection timed out issue when I used an office VPN (FortiClient), so check whether any virus scanner or VPN is blocking the connection. After I shut down the VPN and connected through my local broadband (LAN connection), it worked fine.
The pytorch issue may be a CUDA compatibility problem; let me check whether any forum has answers for it.
I was in the fastai directory when I issued conda env create -f ~/fastai/environment.yml, so to use a path I had to give it relative to home. I can get up into home and redo it, but I wouldn’t have thought it should make much difference - in fact I suspect just doing it without the path information should work if I am in the directory that contains the yml file?
I don’t believe I have an environment issue, source activate fastai works just fine, I will try importing other libraries and see how things go…
OK. I started IPython and executed the lines of imports.py one by one. They were all OK except for the import of seaborn (see below). But that indicates to me that the environment is OK, it’s just pytorch that isn’t.
import seaborn as sns
QXcbConnection: Could not connect to display
Aborted (core dumped)
Found a reference to this which says that in an IPython environment you may have to set the variable via os.environ, and this worked for me:
import os
os.environ['QT_QPA_PLATFORM']='offscreen'
import seaborn as sns
No error!
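The same workaround can be applied once in the shell before starting IPython, so the os.environ line is not needed in every session:

```shell
# Tell Qt to render offscreen on a headless VM (no X display),
# then start ipython with the variable already in the environment
export QT_QPA_PLATFORM=offscreen
ipython
```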
I will see if I can find advice on upgrading the VM to have CUDA9, failing that I will downgrade to a CUDA8 version of Pytorch.
I have upgraded to CUDA 9, but unfortunately I still get the problem:
Chris_Palmer@FASTAI:~$ cat /usr/local/cuda/version.txt
CUDA Version 9.0.176
Chris_Palmer@FASTAI:~$ source activate fastai
(fastai) Chris_Palmer@FASTAI:~$ ipython
Python 3.6.4 |Anaconda, Inc.| (default, Mar 13 2018, 01:15:57)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import torch
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-eb42ca6e4af3> in <module>()
----> 1 import torch
~/.conda/envs/fastai/lib/python3.6/site-packages/torch/__init__.py in <module>()
54 except ImportError:
55 pass
---> 56 from torch._C import *
57
58 __all__ += [name for name in dir(_C)
ImportError: /opt/intel/mkl/lib/intel64/libmkl_gf_lp64.so: undefined symbol: mkl_lapack_ao_ssyrdb
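The undefined symbol coming from /opt/intel/mkl suggests the system-wide MKL is shadowing the MKL that conda installed alongside pytorch. One thing worth trying (an assumption, not a confirmed fix) is to keep /opt/intel out of the dynamic linker's search path before importing torch:

```shell
# See whether /opt/intel/mkl is on the library search path
echo $LD_LIBRARY_PATH

# Temporarily drop it for this shell session, then retry the import
unset LD_LIBRARY_PATH
python -c "import torch; print(torch.__version__)"
```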