Just compiled PyTorch with CUDA 10. The build logs showed it compiling for compute capability 7.0 (Volta) but not 7.5 (Turing). Not sure if this will bite me later or not…
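If you're building from source, you can usually force the Turing target explicitly with the TORCH_CUDA_ARCH_LIST environment variable before kicking off the build (a sketch, assuming a standard setup.py build; sm_75 requires CUDA 10):

```shell
# Tell the PyTorch build which compute capabilities to compile kernels for.
# 7.0 = Volta, 7.5 = Turing (RTX 20xx). Without 7.5, Turing cards fall back
# to running the 7.0 binaries via JIT/compatibility, which usually works but
# may cost startup time or miss arch-specific optimizations.
export TORCH_CUDA_ARCH_LIST="7.0;7.5"
python setup.py install

# After the rebuild, check what the installed torch reports for your card:
python -c "import torch; print(torch.cuda.get_device_capability(0))"
```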
The output of python -c 'from fastai import *; show_info(0)' is:
distro info : Ubuntu 16.04 Xenial Xerus
python version : 3.6.3
fastai version : 1.0.5
torch version : 1.0.0a0+805f4d5
nvidia driver : 410.57
cuda version : 10.0.130
cudnn version : 7301
cudnn available: True
torch gpu count: 2
[gpu0]
name : GeForce RTX 2080 Ti
total memory : 10989MB
[gpu1]
name : GeForce RTX 2080 Ti
total memory : 10981MB
Does it work for all kinds of tasks? I mean, it should give a speedup for various kinds of models, not only for image recognition architectures, right?
If you use nn.DataParallel then effectively you get double the memory, since each batch is split across the two cards. You should use NVLink to get good performance, of course (which requires a 2080 or better, IIRC).
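A minimal sketch of what that wrapping looks like (the toy model and batch size here are just for illustration, not from the thread):

```python
import torch
import torch.nn as nn

# Any nn.Module works; this toy net is just a placeholder.
model = nn.Sequential(nn.Linear(10, 50), nn.ReLU(), nn.Linear(50, 2))

# DataParallel splits each input batch across the visible GPUs, so with two
# cards each GPU holds roughly half the batch's activations -- which is why
# you can effectively fit a batch twice as large.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model).cuda()

x = torch.randn(64, 10)  # a batch of 64; with 2 GPUs each card gets ~32
if torch.cuda.is_available():
    x = x.cuda()
out = model(x)
print(out.shape)  # torch.Size([64, 2]) -- outputs are gathered back together
```

Note that the gradients are reduced onto one device each step, so GPU 0 still uses a bit more memory than GPU 1.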
Do you have some kind of comparison table? Or some kind of benchmark to test?
I knew you had a nice record there, but I missed the details of the implementation. If you can somehow make fp16 twice as fast as fp32, then it really changes my video-card comparison.
1080 Ti would be my recommendation here. Memory is one of the more important aspects of deep learning, and while there are ways around the limitation now, they're still complex. 11 GB of memory is a much better footprint than the 8 GB of the 2080, especially when you're also running the graphics off that card as well.
I still have to understand how much memory an RTX card saves when it runs computations in mixed precision, both for vision and for structured data. That is, how much of that mixed-precision computation is actually done in FP16.
I’m seeing 2070 as low as $499 at EVGA so you could get 2 8gb fp16 gpus cheaper than the single 11gb 2080ti. The 2070s do not have NVlink, but I don’t know if that would add enough benefit to offset the increase in power/memory at a lower price.
Hm, I see. Yes, I am trying to work out right now which is more beneficial: a GPU with fp16 computation support, or 11 GB of memory. I hadn't thought the answer would be so difficult.
I’m interested in this answer as well. @sgugger has an excellent post on single precision vs mixed precision. I just posted a question on the Mixed Precision thread asking him what memory usage he’s seen in practice on the mixed precision work he’s done.
Hi Ilia (hope the name is OK). I bought one last week in preparation for Jeremy's new course tomorrow. I HATE games (grumpy old man). I will be looking to advise you numerically, as the machine that I am 'upgrading' is an old i7-9?? in an LGA1366, X58 board. I'll be incrementally plugging in a Gigabyte 2080 Ti OC, a 970 EVO V-NAND SSD, and a 2 TB 860 Pro V-NAND: should be fun and instructive. Will advise.
Cheers, peter kelly.