I have a GCP instance.
Here are my installed library versions:
from fastai.utils import *
show_install()
=== Software ===
python version : 3.7.0
fastai version : 1.0.30
torch version : 1.0.0.dev20181120
nvidia driver : 410.72
torch cuda ver : 9.2.148
torch cuda is : available
torch cudnn ver : 7401
torch cudnn is : enabled
=== Hardware ===
nvidia gpus : 2
torch available : 1
- gpu0 : 16130MB | Tesla V100-SXM2-16GB
- gpu1 : 16130MB | Tesla V100-SXM2-16GB
=== Environment ===
platform : Linux-4.9.0-8-amd64-x86_64-with-debian-9.6
distro : #1 SMP Debian 4.9.130-2 (2018-10-27)
conda env : base
python : /opt/anaconda3/bin/python
sys.path :
/home/jupyter/fastai-course-v3/nbs/dl1
/opt/anaconda3/lib/python37.zip
/opt/anaconda3/lib/python3.7
/opt/anaconda3/lib/python3.7/lib-dynload
/opt/anaconda3/lib/python3.7/site-packages
/opt/anaconda3/lib/python3.7/site-packages/IPython/extensions
/home/jupyter/.ipython
Hardware specs:
OS: Debian 4.9.130-2 (2018-10-27)
RAM: 52GB
CPU: 8 vCPU (skylake)
HD: 200 GB hdd
GPU: V100 x 2
Benchmarks:
Training: resnet34
learn.fit_one_cycle(4): Total time: 01:47 (single gpu)
learn.fit_one_cycle(4): Total time: 01:56 (dual gpu)
After the "Unfreezing, fine-tuning, and learning rates" step:
learn.fit_one_cycle(1): Total time: 00:27 (single gpu)
learn.fit_one_cycle(1): Total time: 00:27 (dual gpu)
learn.fit_one_cycle(2, max_lr=slice(1e-6,1e-4)): Total time: 00:53 (single gpu)
learn.fit_one_cycle(2, max_lr=slice(1e-6,1e-4)): Total time: 00:54 (dual gpu)
Training: resnet50
learn.fit_one_cycle(5): Total time: 03:11 (single gpu)
learn.fit_one_cycle(5): Total time: 03:16 (dual gpu)
After unfreezing:
learn.fit_one_cycle(1, max_lr=slice(1e-6,1e-4)): Total time: 00:44 (single gpu)
learn.fit_one_cycle(1, max_lr=slice(1e-6,1e-4)): Total time: 00:41 (dual gpu)
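As these benchmarks show, running on two GPUs did not improve training time for resnet34: the dual-GPU runs took about the same time as the single-GPU runs (for example, 01:56 vs. 01:47 for the first four epochs). The resnet50 timings were similarly close.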
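To make the comparison concrete, here is a small sketch that converts the timings reported above into single-vs-dual speedup ratios. The helper names (to_seconds, speedup) and the run labels are my own, for illustration only; the times are copied from the benchmarks in this post.

```python
def to_seconds(mmss):
    """Convert a fastai-style 'MM:SS' total time into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(single, dual):
    """Ratio of single-GPU time to dual-GPU time (above 1.0 means dual was faster)."""
    return to_seconds(single) / to_seconds(dual)


# (label, single-GPU time, dual-GPU time), copied from the benchmarks above
runs = [
    ("resnet34 fit_one_cycle(4)", "01:47", "01:56"),
    ("resnet34 unfrozen fit_one_cycle(1)", "00:27", "00:27"),
    ("resnet34 unfrozen fit_one_cycle(2, max_lr=slice(1e-6,1e-4))", "00:53", "00:54"),
    ("resnet50 fit_one_cycle(5)", "03:11", "03:16"),
    ("resnet50 unfrozen fit_one_cycle(1, max_lr=slice(1e-6,1e-4))", "00:44", "00:41"),
]

for label, single, dual in runs:
    print(f"{label}: {speedup(single, dual):.2f}x")
```

A ratio near 1.0 means the second GPU made no difference. All five runs fall between roughly 0.92 and 1.07.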
P.S.: I ran the notebook as-is. For single GPU, I changed nothing. To test dual GPU, I simply added "learn.model = torch.nn.DataParallel(learn.model, device_ids=[0, 1])" before fitting.
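For anyone who wants to try the same wrapping outside the notebook, here is a minimal sketch in plain PyTorch. It is not the notebook code: the small nn.Sequential model stands in for learn.model, and time_forward is a hypothetical helper of mine. torch.nn.DataParallel splits each input batch along dimension 0 across the listed devices; on a machine with fewer than two GPUs the sketch simply skips the wrapping.

```python
import time

import torch
import torch.nn as nn

# Stand-in model (assumption); in the notebook this would be learn.model.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

n_gpus = torch.cuda.device_count()
device = torch.device("cuda:0" if n_gpus > 0 else "cpu")
model = model.to(device)

# Wrap only when more than one GPU is visible; each batch is then
# split along dim 0 and scattered across the listed devices.
if n_gpus > 1:
    model = nn.DataParallel(model, device_ids=list(range(n_gpus)))

batch = torch.randn(128, 32, device=device)


def time_forward(module, x, repeats=10):
    """Hypothetical helper: time `repeats` forward passes after one warm-up."""
    with torch.no_grad():
        module(x)  # warm-up pass, not timed
        if x.is_cuda:
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(repeats):
            out = module(x)
        if x.is_cuda:
            torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
    return out, time.perf_counter() - start


out, elapsed = time_forward(model, batch)
print(tuple(out.shape), f"{elapsed:.4f}s")
```

Running this once as written and once with the wrapping disabled gives a rough forward-pass comparison only; it does not include data loading or the backward pass, both of which are part of the fit_one_cycle times above.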