Hi, I have been following the fastai v2 course (2 weeks completed) since December '18. I am training locally on my own machine, not on a server. It took me about a week of multiple CUDA and cuDNN downloads, scouring multiple websites (thanks to my antivirus) and this forum to get my Windows 10 machine set up and working (I am new to all this, so please don’t judge me!). So I would really like to see a head-to-head comparison of v2 and v3.
Also, if I choose v3, what updates do I need to make to my desktop environment, or do I have to reinstall fastai all over again?
As we may expect from fastai, everything has changed, and of course Jeremy shares many new ideas.
I’d highly recommend dual booting instead of installing everything on Windows. That way you can spend more time learning rather than dealing with painful CUDA installations.
The setup instructions for Linux are in the fastai docs.
For a dual boot tutorial, here is one that I wrote
Re: whether you should upgrade: it’s a personal choice, and I’ve only been through Lesson 1 of v3 thus far. However, I’d say that since you’re only 2 weeks in, it’s worth considering if you can get your environment upgraded with less effort than your first pass.
FWIW, I tried a simple upgrade of my v2 environment (run on fastai v0.7 and pytorch v0.4) to support v3 (run on fastai v1.0 and pytorch v1.0) and couldn’t get it working.
The solution was to create a new conda env from scratch. Here is my approach in the terminal:
$ nvidia-smi # to get cuda version, e.g. 10.0
$ conda update conda
$ conda search cuda -c pytorch # to find matching package, e.g. cuda100
$ conda create -n fastai3 # don't specify a Python version here!
$ conda activate fastai3
$ conda install -c pytorch -c fastai fastai pytorch torchvision cuda100 # change to match appropriate cuda pkg
After rebooting, this procedure resulted in a working
Note: I’m running miniconda3 on Ubuntu 18.04 LTS on a Dell XPS 15 (dual-boot with Windows 10) with an NVIDIA GeForce GTX 1050 with CUDA v10.0.
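If you want to skip the manual lookup in the first and third commands, here is a rough sketch of how the package name could be derived. It assumes your `nvidia-smi` header prints a `CUDA Version: X.Y` field (older drivers may not) and that the package names in the pytorch channel follow the same pattern as `cuda100`. Both function names are made up for illustration, and the version shown by `nvidia-smi` is the one the driver supports, so still confirm the result with `conda search`.

```shell
# Hypothetical helper: read nvidia-smi output on stdin and print the
# "X.Y" that follows "CUDA Version:" (assumes the header has that field).
cuda_version_from_smi() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: turn "10.0" into "cuda100" by dropping the dot.
# The cudaXY naming pattern is an assumption based on the cuda100 example.
cuda_pkg_name() {
  printf 'cuda%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# Usage on a machine with a working driver:
#   cuda_pkg_name "$(nvidia-smi | cuda_version_from_smi)"
```

The helpers only print a name; the `conda install` line above still has to be run by hand with that name swapped in.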
Google Colab now supports fastai and needs no installation, so I would recommend doing v3 on Colab instead of reinstalling everything.
PS: I just completed v2 and have done the first 2 lessons of v3. So far v3 looks quite different from v2.
I just use different anaconda environments for the different versions. I haven’t run into issues.
I already have a dual boot setup (Win10/Ubuntu 18.04), thanks to pip atrocities on Windows. I had a hard time getting everything configured on Ubuntu (I have tried that as well). The only reason Windows wasn’t working was my AV, which I eventually figured out.
I wasn’t able to run the “nvidia-smi” command. I tried all the forums; it said the drivers weren’t up to date, but according to the website I did have the latest version.
If nvidia-smi didn’t work, then I suspect you downloaded and installed the drivers manually, no?
Do you know what version of CUDA you have installed on your Linux partition? If so, you can try just proceeding with the command list I supplied above. Again, it seems the consensus is that it’s easiest to proceed with separate envs for v2 and v3.
If you don’t know which version of CUDA you installed and you did it recently, you could try creating one env with cuda100 for v10.0.
As a last resort, you could follow the Troubleshooting Guide for fastai and attempt to reinstall your Nvidia drivers, or simply do as suggested above and use Google Colab (disclaimer: I’ve heard not all users get full access to the GPU memory - some as low as 5% - haven’t tried it myself).
Creating different environments is the way to go. You do not need to install CUDA or cuDNN unless you are compiling your own binaries. PyTorch already has them built in if you use its pre-made binaries. The version you can run depends on the Nvidia driver version you are using. New drivers support CUDA 10.
Taken from NVIDIA:
Table 1. CUDA Toolkit and Compatible Driver Versions
|CUDA Toolkit|Compatible Driver Version|
|---|---|
|CUDA 10.0 (10.0.130)|>= 410.48|
|CUDA 9.2 (9.2.88)|>= 396.26|
|CUDA 9.1 (9.1.85)|>= 390.46|
|CUDA 9.0 (9.0.76)|>= 384.81|
|CUDA 8.0 (8.0.61 GA2)|>= 375.26|
|CUDA 8.0 (8.0.44)|>= 367.48|
|CUDA 7.5 (7.5.16)|>= 352.31|
|CUDA 7.0 (7.0.28)|>= 346.46|
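To read the table the other way round (given a driver, what is the newest CUDA it supports?), a small lookup like the following might help. It is only a sketch: the function names are made up, it relies on GNU `sort -V` for version comparison, and it treats the lower 8.0 row (367.48) as the minimum for any CUDA 8.0 build.

```shell
# Succeeds if version $1 >= version $2 (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print the newest CUDA toolkit from Table 1 that
# the given driver version supports; fails if the driver is too old.
max_cuda_for_driver() {
  # Rows are "<CUDA toolkit> <minimum driver>", newest first.
  while read -r cuda min; do
    if version_ge "$1" "$min"; then
      echo "$cuda"
      return 0
    fi
  done <<'EOF'
10.0 410.48
9.2 396.26
9.1 390.46
9.0 384.81
8.0 367.48
7.5 352.31
7.0 346.46
EOF
  return 1
}

# Usage on a machine with a working driver:
#   max_cuda_for_driver "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
```

For example, a 396.54 driver lands on the 9.2 row, so a CUDA 10.0 build would not be expected to work with it.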
Getting the older fast.ai up and running on your local machine may not be straightforward. You can see the steps here:
Ultimately (just my 2 cents), if you are just starting out, I would go with V3 of the course and the new libraries. In your own post you said you spent a week just trying to get the environment up and running. Do not let the software setup be a barrier to learning the ML side of things. You can always come back and brush up on your linux commands later. Good luck
- If you already tried installing the cuDNN and CUDA toolkits, then you may be better off reinstalling Ubuntu. I broke my system many times trying to get those libraries working and ended up in dependency hell. It was faster to just reinstall than to clean out the package manager.
Yes, I manually installed the Nvidia drivers when I installed Ubuntu 18.04 on my system about half a year ago.
Update: I went through some more forums, did a bit more digging, and found out that not only the drivers but also the kernel needs to be up to date for nvidia-smi to work. So I went back to Ubuntu and found I was running an older Linux kernel. I installed UKUU, got the new Linux kernel version, booted into it and typed the commands you mentioned. The nvidia-smi command started working and everything is up and running fine as of now. I will certainly upgrade to v3 now that all the headache is gone.
But this leaves me with another question.
In UKUU, the application showed Linux kernels 4.15 and 4.18 both installed (before any extra kernel installations). The forums say that Ubuntu boots with the latest version available. However, whenever it booted, it chose 4.15 even though 4.18 was installed.
The above conclusion was drawn from two things:
- UKUU was showing 4.15 and 4.18 installed.
- uname returned 4.15.
I would like to know why this happens.
And about the full access part, I will certainly try that out and report back on what happens.
Thanks a lot!
Disclaimer: this is deviating considerably from what I imagine most fast.ai users will experience.
Sounds like something didn’t work properly when installing the dev kernel from UKUU. I have not used UKUU in the past, because in my understanding, it is intended for developers and not supported by Ubuntu in the sense that your routine updates/upgrades via the apt package manager won’t touch your UKUU-installed kernel.
At this point, sudo apt install linux-generic may help you install the latest kernel supported by your Ubuntu distro. The only caveat is - if your UKUU kernel actually did install correctly - grub may still pick that kernel during the boot sequence, so you may need to uninstall the UKUU kernel or start from scratch with a new Ubuntu install.
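Before uninstalling anything, it may be worth checking which kernel is actually running against what is installed. This is just a sketch: the function names are made up, it assumes GNU `sort -V`, and it assumes the installed kernels show up in /boot as vmlinuz-<version>, which is what I see on Ubuntu.

```shell
# Hypothetical helper: read kernel versions on stdin, print the newest one.
newest_kernel() {
  sort -V | tail -n 1
}

# Hypothetical helper: compare the running kernel ($1) with the newest
# installed kernel ($2) and report whether they match.
kernel_status() {
  if [ "$1" = "$2" ]; then
    echo "running the newest installed kernel: $1"
  else
    echo "running $1, but $2 is also installed"
  fi
}

# Usage on the machine itself (assumes /boot/vmlinuz-<version> files):
#   installed=$(ls /boot/vmlinuz-* | sed 's|^/boot/vmlinuz-||' | newest_kernel)
#   kernel_status "$(uname -r)" "$installed"
```

If the two differ after a reboot, that points at the boot selection rather than at a failed kernel install.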
For the record, I’m running Ubuntu 18.04.1 LTS with the 4.15.0-44-generic Linux kernel, so I’m not convinced your problem was actually with the kernel in the first place.
Finally, if you’re new to Linux/Ubuntu, I suggest that you stick to the Ubuntu forums when trying to troubleshoot issues with your install. They tend to stay away from things like UKUU that can break your system if you don’t really know what you’re doing (which includes me, FWIW).