How to install RTX-enabled fastai? (CUDA 10)

I have two cards in my system, a 1080 Ti and a 2080. I wonder if it is possible to install PyTorch and fastai with the 4xx drivers and CUDA 10. As far as I can see, there are instructions on how to build PyTorch from source. However, I am not sure if fastai and everything else will work as expected in this setting.

Could someone advise how best to proceed in this case? Is it possible to keep an old 3xx driver for both cards and go with CUDA 9.2?

Hi, you need driver 410.xx:

https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

CUDA Toolkit                  Linux x86_64 Driver Version   Windows x86_64 Driver Version
CUDA 10.0.130                 >= 410.48                     >= 411.31
CUDA 9.2 (9.2.148 Update 1)   >= 396.37                     >= 398.26
CUDA 9.2 (9.2.88)             >= 396.26                     >= 397.44
CUDA 9.1 (9.1.85)             >= 390.46                     >= 391.29
CUDA 9.0 (9.0.76)             >= 384.81                     >= 385.54
CUDA 8.0 (8.0.61 GA2)         >= 375.26                     >= 376.51
CUDA 8.0 (8.0.44)             >= 367.48                     >= 369.30
CUDA 7.5 (7.5.16)             >= 352.31                     >= 353.66
CUDA 7.0 (7.0.28)             >= 346.46                     >= 347.62

If you had been using containers, this would be just trivial…
There is an alternative I can suggest.

Instead of installing CUDA from the fastai environment, install it from the deb package. Then you will be free to use whatever CUDA version you like, together with the matching PyTorch build from the official website.
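
For example, the deb route for CUDA 10.0 on Ubuntu 18.04 looks roughly like this (a sketch; verify the exact repo filename and key URL on NVIDIA's download page before running):

# add NVIDIA's CUDA 10.0 network repo and install the toolkit
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo apt update
sudo apt install cuda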

Ok, got it! My problem is that as soon as I install a 4xx driver, I can’t import torch anymore. The nvidia-smi tool shows both GPUs, but torch stops working.

Yes, I am going to try your advice and install CUDA from the deb package. Do you think I should pick the None option in the configurator below (when installing PyTorch after the CUDA driver has been installed from the deb)?

Well, this happens because the nvidia-smi that comes with the fastai environment (inside the cuda92 package) is tied to a different driver version (it is not generic), so it conflicts with the actual driver and nvidia-smi you have.

The approach that I believe will cause the least trouble is to comment out the cuda92, cudnn, and pytorch entries in the fastai environment, then manually install driver 410.xx, CUDA 10.0, and cuDNN 7.xx. After that, still using the fastai environment, install PyTorch by pip install … from the official site, or build your own for CUDA 10.
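
A rough sketch of what I mean (I'm assuming your copy of environment.yml still pins cuda92, cudnn, and pytorch entries; the cu100 nightly URL is the one the official site's preview instructions give):

# in the fastai repo's environment.yml, comment out the pinned entries, e.g.:
#   # - cuda92
#   # - cudnn
#   # - pytorch-nightly
# then, with the fastai conda environment activated, install a CUDA 10 build:
pip install torch_nightly -f https://download.pytorch.org/whl/nightly/cu100/torch_nightly.html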


I can give you some ideas to try…

Wow, you are on a Mac… this will not work! Your only option is to find some conda channel that ships a PyTorch built against CUDA 9.x or CUDA 10 for Anaconda that you can install… I will look for it.

I believe this is just a wrapper to use your local CUDA driver on your machine and have the reference inside your conda environment, because it installs only a 2 KB file:

conda install -c fragcolor cuda10.0

fastai v1 doesn’t have any cuda deps in pip or conda - so you can use whatever you like.
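
So once the driver/CUDA/PyTorch stack is in place, a one-line sanity check (plain PyTorch calls) confirms the GPUs are visible:

# should print True and the number of visible GPUs
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"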


No, actually I have an Ubuntu 18.04 desktop with the GPUs installed. I'm just using the Mac as a terminal =)

Ok, got it! So if I manage to install an appropriate driver, CUDA, and PyTorch, then fastai will just work, I guess.

You mean, instead of following the instructions from the repo, install my own CUDA/cuDNN/NVIDIA driver, right?

Ok, thank you for the responses! I'll give it a try.


Good to know… you can try to just update your driver, then!
This will install only driver 410.xx, using the PPA and not touching anything else:

sudo add-apt-repository ppa:graphics-drivers/ppa -y
sudo apt update -qq
sudo apt install -y --no-install-recommends cuda-drivers

Should I replace cuda-drivers with a specific version? Because the last command returns:

Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package cuda-drivers

And I already have the repository added.


Hi, use this instead:

sudo apt install --no-install-recommends \
      libcuda1-410 \
      libxnvctrl0 \
      nvidia-410 \
      nvidia-410-dev \
      nvidia-libopencl1-410 \
      nvidia-opencl-icd-410 \
      nvidia-settings

Just in case anyone here needs a fully CUDA 10 based PyTorch nightly build (including magma-cuda10) for Python 3.7 on Ubuntu 18.04, see here: https://vxlabs.com/2018/11/04/pytorch-1-0-preview-nov-4-2018-packages-with-full-cuda-10-support-for-your-ubuntu-18-04-x86_64-systems/

(in that post, I obviously use the new fastai documentation to test the fp16 callback :slight_smile:)

I built this pytorch package because I have my eye on an RTX 2070.
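
If anyone wants a quick fp16 smoke test of their own, something along these lines should work (the names follow the fastai v1 API of that period, i.e. create_cnn and to_fp16, and MNIST_SAMPLE is just a small dataset picked for speed; adjust to your installed version):

# minimal mixed-precision sanity check; assumes fastai v1 and a CUDA-enabled torch
python -c "
from fastai import *
from fastai.vision import *
path = untar_data(URLs.MNIST_SAMPLE)
data = ImageDataBunch.from_folder(path)
learn = create_cnn(data, models.resnet18, metrics=accuracy).to_fp16()
learn.fit_one_cycle(1)
"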


Thanks for sharing @cpbotha

Which magma version is it, 2.4.0 or 2.4.1?

@willismar Thank you for your advice, yes, I just updated the driver using:

sudo apt install nvidia-driver-410

After that, I have two recognized devices:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.73       Driver Version: 410.73       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:01:00.0  On |                  N/A |
| 23%   32C    P8    12W / 250W |    110MiB / 11177MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 2080    Off  | 00000000:02:00.0 Off |                  N/A |
| 41%   29C    P8    17W / 225W |      0MiB /  7952MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

I was able to run fp16 training. However, I need more tests to see if everything works as expected. I didn’t try to install a new version of PyTorch or compile it from source, so I guess it still uses CUDA 9.2.
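
A quick way to check is torch.version.cuda, which reports the CUDA version the installed PyTorch was built against:

# prints the CUDA version torch was compiled with, e.g. 9.2.148
python -c "import torch; print(torch.version.cuda)"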

@cpbotha Yes, I was looking for something like this. I guess I’ll try this approach soon.


By the way, it seems there is a little bug in fastai.show_install:

=== Hardware ===
nvidia gpus     : 2
torch available : 2
  - gpu0        : 11177MB | GeForce RTX 2080
  - gpu1        : 7952MB | GeForce GTX 1080 Ti

The device names don’t match their memory sizes. In my system, gpu1 is the 1080 Ti.
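
For what it's worth, listing the devices as PyTorch itself sees them can help narrow this down; by default the CUDA runtime may order devices fastest-first, unlike nvidia-smi's PCI bus order, so forcing PCI ordering makes the two comparable (standard torch.cuda calls):

# enumerate GPUs with name and memory as torch sees them;
# CUDA_DEVICE_ORDER=PCI_BUS_ID forces nvidia-smi's ordering
CUDA_DEVICE_ORDER=PCI_BUS_ID python -c "
import torch
for i in range(torch.cuda.device_count()):
    p = torch.cuda.get_device_properties(i)
    print(i, p.name, p.total_memory // 2**20, 'MiB')
"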


I used magma 2.3.0, because that was the last version used by the https://github.com/pytorch/builder/ build scripts.

Does 2.4.x have improvements which justify a rebuild of PyTorch? (it takes a few hours in total on an i7 with SSD)

Hi again @devforfu

I believe you can report this as a bug to fast.ai so they can fix it. But if the nvidia-smi output is correct, everything should be fine for you.

By the way, you may have your fastai working, but be aware that no CUDA 9.xx or lower exists for Ubuntu 18.04; only CUDA 10 is available. As Jeremy told us, with fastai v1.0 you can use whatever you want.

Hi,
the other day I compared the source code of magma 2.3.0 and 2.4.0 in the context of the patch files, and I found that almost all of the patches for magma from the pytorch/builder project are already in the new version.

The only patches one may still need to apply, if desired, are cmakelists.patch and thread_queue.patch for magma 2.4.0.

I can show you if you need.
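
For anyone who wants to try right away, applying them would look roughly like this (a sketch; the -p strip level and paths depend on how the patches were generated, so check the patch headers):

# apply the two remaining pytorch/builder patches to a magma 2.4.0 source tree
cd magma-2.4.0
patch -p0 < ../cmakelists.patch
patch -p0 < ../thread_queue.patch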

@cpbotha Thanks a lot for your blog post, I’ll read it in more detail tomorrow.

I understand you don’t have access to the v3 section of the forum; otherwise you’d see a post on “RTX series + Fastai”, where some of us (especially me!) are struggling to get mixed precision to run properly on either the 2070 or the 2080 Ti.

The most annoying/surprising part is that fp16 won’t run beyond a batch size HALF that of fp32 (248 max vs 512) on the fastai CIFAR10 notebook (the one in the GitHub repo), while in theory it should handle DOUBLE. It still runs 10% faster than fp32, despite the half batch size :triumph:

BTW, I posted a few runs of CIFAR10 with my 1080Ti, 2070 and 2080Ti here.

Durnit, I suspected there was a part of the forum I was not able to see. :frowning:

Interesting, that batch-size constraint. Did sgugger also take a look? Could it be due to the divisible-by-8 fp16 constraint? (That would be strange, because there are too many clever people hanging out on this forum who would have diagnosed that first.)