Use multiple GPUs in FastAI Course v3 - 2019?

Hi all,

I’ve spent a number of months building a
workstation for machine learning. Some of the
posts I read talked about multiple GPUs.

So I bought and installed two GPUs in my motherboard.

nvidia-smi --list-gpus
GPU 0: GeForce GTX 1060 6GB (UUID: …)
GPU 1: GeForce GTX 1060 6GB (UUID: …)

I’m finally getting started on Lesson 1 of FastAI 2019.

The initial code with RESNET34 worked.

But after skipping past the RESNET34 code and
running the RESNET50 code, I got the error:

RuntimeError: CUDA out of memory.
Tried to allocate 2.00 MiB (GPU 0; 5.93 GiB total capacity;
4.57 GiB already allocated; 2.25 MiB free; 91.03 MiB cached)

nvidia-smi
Sat Aug 29 19:42:56 2020
+----------------------------------------------------------------------------+
| NVIDIA-SMI 440.100 Driver Version: 440.100 CUDA Version: 10.2 |
|-------------------------------+---------------------+---------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 106… Off | 00000000:03:00.0 On | N/A |
| 9% 58C P8 11W / 120W | 5969MiB / 6075MiB | 0% Default |
+------------------------------+---------------------+---------------------+
| 1 GeForce GTX 106… Off | 00000000:04:00.0 Off | N/A |
| 0% 38C P8 4W / 120W | 12MiB / 6078MiB | 0% Default |
+------------------------------+---------------------+---------------------+

+----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1058 G /usr/lib/xorg/Xorg 35MiB |
| 0 1671 G /usr/lib/xorg/Xorg 390MiB |
| 0 1901 G cinnamon 116MiB |
| 0 2467 C /usr/lib/libreoffice/program/soffice.bin 63MiB |
| 0 2543 G …AAAAAAAAAAAACAAAAAAAAAA= --shared-files 156MiB |
| 0 3750 C /home/oracle/anaconda3/bin/python 5193MiB |
+----------------------------------------------------------------------------+

The code didn’t use the second GPU at all.
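
As a quick sanity check on the process table above, the per-process figures can be added up to see how much of GPU 0 is held by something other than the training process. This is only arithmetic on the numbers nvidia-smi printed; the dictionary labels are made up for readability.

```python
# Per-process GPU 0 memory in MiB, copied from the nvidia-smi process table.
gpu0_processes = {
    "Xorg (PID 1058)": 35,
    "Xorg (PID 1671)": 390,
    "cinnamon": 116,
    "soffice.bin": 63,
    "PID 2543 (name truncated)": 156,
    "python (training)": 5193,
}

training_mib = gpu0_processes["python (training)"]
other_mib = sum(gpu0_processes.values()) - training_mib
total_listed_mib = training_mib + other_mib

print(f"training process: {training_mib} MiB")
print(f"other processes:  {other_mib} MiB")
print(f"total listed:     {total_listed_mib} MiB")
```

By this tally, about 760 MiB of GPU 0 is taken by the desktop and other applications, while GPU 1 shows only 12 MiB in use.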

I was doing some reading at:

All this webpage mentions is:
Order of GPUs
Not how to use multiple GPUs

and

How to use Multiple GPUs?

Part 1 (2018)

Posts from 2017 to early 2019

With that version, some found that
it might have been beneficial
to work with 2, but not 3 or 4 GPUs

Q1:
For FastAI Course v3 - 2019,
was a solution found to use two GPUs?

Q2:
For Course v4 - 2020 (Part 1) → fastai v2
was a multi GPU solution found?

If so, please send the links.

Thanks a lot

Hey, using multiple GPUs usually refers to either running multiple experiments in parallel (each experiment on its own GPU) or splitting each batch across several GPUs at the same time. However, your GPUs are old and don’t have much memory, so you’re running out of GPU memory when trying to train RESNET50. The 2nd GPU won’t be able to help you there, since the entire model has to be placed on a single GPU when using fastai.

I found this thread from October 2018

Some people got multiple GPUs to work

Then had issues saving the model

How to use multiple gpus

Looks like fastai v2 has
libraries to take advantage of multiple GPUs

fastgpu:
… If more than one GPU is available, multiple scripts are run in parallel, one per GPU.
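
The "one script per GPU" idea can be sketched with the standard library alone. This is not fastgpu's API, just a rough illustration of the underlying approach, assuming each script should see a single GPU through the CUDA_VISIBLE_DEVICES environment variable. The helper names `env_for_gpu` and `launch_one_per_gpu` are hypothetical.

```python
import os
import subprocess
import sys


def env_for_gpu(gpu_id):
    """Return a copy of the environment that exposes only one GPU to a child process."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def launch_one_per_gpu(scripts, gpu_ids):
    """Start one script per GPU and return the running processes.

    Scripts beyond the number of GPUs are ignored in this sketch;
    a real scheduler would queue them until a GPU frees up.
    """
    procs = []
    for script, gpu_id in zip(scripts, gpu_ids):
        cmd = [sys.executable, script]
        procs.append(subprocess.Popen(cmd, env=env_for_gpu(gpu_id)))
    return procs
```

With two GPUs, a call such as `launch_one_per_gpu(["exp_a.py", "exp_b.py"], [0, 1])` (script names made up here) would start both experiments side by side, each seeing its assigned card as device 0.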

fastgpu library:

There is also this documentation:

Distributed and parallel training

Although it doesn’t state the
version of the fastai library
it applies to

:frowning:

Just in case you want to use multiple GPUs for inference, you can split up the work and then load the model on dedicated GPUs as follows:
# cpu=True avoids loading the model onto GPU 0 first
self.fastai_learner = load_learner(self.model_directory + '/' + Train.TRAINED_MODEL_FILE_NAME, cpu=True)
# Move the model to the GPU defined by gpu_id
self.fastai_learner.model.cuda(gpu_id)
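
The "split up the work" step is not shown above. One simple way to do it, offered as a sketch and not necessarily what the poster did, is to deal the inputs out round-robin, one bucket per GPU. The function name and the file names are hypothetical.

```python
def split_round_robin(items, n_gpus):
    """Deal items out to n_gpus buckets, one bucket per GPU."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    buckets = [[] for _ in range(n_gpus)]
    for i, item in enumerate(items):
        buckets[i % n_gpus].append(item)
    return buckets


# Hypothetical file names standing in for the images to classify.
files = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
work = split_round_robin(files, n_gpus=2)

for gpu_id, chunk in enumerate(work):
    # Each chunk would be handed to a worker whose model was moved to gpu_id.
    print(gpu_id, chunk)
```

Each worker would then load its own learner with cpu=True, move the model to its gpu_id, and predict only on its own bucket.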

Hi,

I have tried following the instructions on this documentation page. However, step two states:

Run configure_accelerate from the command line

It doesn’t say where that CLI comes from.
Is that a CUDA command?

Thank you in advance,

Jon