RTX 2080/2080 Ti for Deep Learning?

The 1080 Ti would be my recommendation here. Memory is one of the more important aspects of deep learning, and while there are ways around the limitation now, they are still complex. 11 GB of memory is a much better footprint than the 8 GB of the 2080, especially when you're also running your display off that card.

regards
Even

This is the updated ImageNet training repo: https://github.com/diux-dev/imagenet18

I still have to understand how much memory an RTX card saves when it runs computations in mixed precision, both for vision and for structured data. That is, how much of that mixed-precision computation is actually done in FP16.
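
For what it's worth, a rough way to get a feel for the activation side of this is to allocate the same tensor in FP32 and FP16 and watch torch.cuda.memory_allocated(). A minimal sketch (the shapes are made up, just to illustrate the 2x difference for activations; it says nothing about the FP32 master weights that mixed precision keeps around):

```python
import torch

def report(label):
    # bytes currently held by live tensors on the default GPU
    print(f"{label}: {torch.cuda.memory_allocated() / 2**20:.1f} MiB allocated")

# a batch of activations roughly like what a conv net might hold
x32 = torch.randn(64, 256, 56, 56, device="cuda", dtype=torch.float32)
report("fp32 tensor")

x16 = x32.half()            # same values stored in fp16: half the bytes
report("fp32 + fp16 copy")

del x32                     # freeing the fp32 tensor drops memory_allocated
report("fp16 only")
```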

I'm seeing the 2070 as low as $499 at EVGA, so you could get two 8 GB FP16-capable GPUs for less than a single 11 GB 2080 Ti. The 2070s do not have NVLink, but I don't know if that would add enough benefit to offset the increase in power/memory at the lower price.

Hm, I see. Yes, I am thinking right now about what is more beneficial: a GPU with FP16 computation support, or 11 GB of memory. I hadn't thought the answer would be so difficult :sweat_smile:

I’m interested in this answer as well. @sgugger has an excellent post on single precision vs mixed precision. I just posted a question on the Mixed Precision thread asking him what memory usage he’s seen in practice on the mixed precision work he’s done.

Hi Ilia (hope the name is ok). I bought one last week in preparation for Jeremy's new course tomorrow. I HATE games (grumpy old man). I will be looking to advise you numerically, as the machine that I am upgrading is an old i7-9?? on an LGA1366, X58 board. I'll be incrementally plugging in a Gigabyte 2080 Ti OC, a 970 EVO V-NAND SSD, and a 2 TB 860 Pro V-NAND: should be fun and instructive. Will advise.
Cheers, Peter Kelly.

I'd have missed that!! Thanks Jeremy.
P

Hi Peter!

That's great! I would really appreciate it if you could run a benchmark. As far as I understand, you should definitely get a speedup with conv nets, and probably the ability to use bigger mini-batches.

Hi Ilia (is 'Ilia' ok? Or should I call you Devforfu? I am not good at this and too old to learn, please advise). Thanks for the reply. Yep, the motherboard is also old (PCIe 2.0), but the new-spec (4.0) boards are due in 3-4 months, AND the 5.0 spec is about to be released! So when will the first quality boards AND CPUs to suit arrive?
I decided to stick with the old motherboard and take each step with it (the 2080 Ti GPU, then the NVMe SSDs), then leap to the major expense of motherboard, RAM, CPU and more in the next $-warp.
Perhaps I'm naive, but at least I can watch!
Cheers & stay cool and faid (new term of trade: FastAI'd),
p

Ok, understood!

I didn't even know that the new architectures support half-precision arithmetic, and I don't know how to apply this technique except by using the fastai library, because as I can see from the source code, FP16 training is not a very straightforward process. However, the idea of having a card that supports this feature sounds very attractive.

P.S. Yes, sure, Ilia is fine =)
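
For the record, the fastai library hides that complexity behind a callback, so from the user's side it is close to a one-liner. A sketch assuming the fastai v1 API as used in the course notebooks, with `data` being an ImageDataBunch you have already built (e.g. the lesson 1 Pets data):

```python
from fastai.vision import *   # fastai v1 style import, as in the course notebooks

# `data` is assumed to be an ImageDataBunch built as in lesson 1
learn = create_cnn(data, models.resnet34, metrics=error_rate)

# to_fp16() attaches fastai's mixed-precision callback: fp16 forward/backward,
# fp32 master weights and loss scaling are handled for you
learn = learn.to_fp16()
learn.fit_one_cycle(1)
```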

If you're running something that can take advantage of FP16 instead of FP32, won't the 2080 effectively have more memory?

Probably not, because we need to keep an fp32 copy of the weights too - it’s actually mixed precision training. I haven’t actually tested this however.
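
To illustrate why: in mixed precision the forward and backward passes run in FP16, but the optimizer keeps and updates an FP32 "master" copy of the weights, and the loss is scaled so that small FP16 gradients don't underflow. A bare-bones sketch of the idea (not fastai's actual implementation; the layer, shapes and loss-scale value are made up):

```python
import torch
from torch import nn

model = nn.Linear(1024, 10).cuda().half()                 # fp16 model weights
master = [p.detach().clone().float() for p in model.parameters()]  # fp32 master copies
for p in master:
    p.requires_grad_(True)
opt = torch.optim.SGD(master, lr=1e-2)
loss_scale = 512.0                                        # keeps tiny fp16 grads from underflowing

x = torch.randn(64, 1024, device="cuda").half()
y = torch.randint(0, 10, (64,), device="cuda")

loss = nn.functional.cross_entropy(model(x).float(), y)
(loss * loss_scale).backward()                            # grads land on the fp16 params

# copy fp16 grads into the fp32 master params, undo the scaling, then step
for p16, p32 in zip(model.parameters(), master):
    p32.grad = p16.grad.float() / loss_scale
opt.step()

# copy the updated fp32 weights back into the fp16 model for the next batch
with torch.no_grad():
    for p16, p32 in zip(model.parameters(), master):
        p16.copy_(p32.half())
```

So the weights exist twice (FP16 plus FP32), which is why the saving comes mostly from activations rather than being a clean halving of memory.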

@jeremy Would you advise that it is still worth choosing an architecture with FP16 support rather than a 10xx one? I mean, from the perspective of the next 1-2 years? Or is it fine for now to stick with "proven" hardware and pick the previous generation? From your personal point of view, of course =)

Though I think my question is probably a bit too vague to answer right now without appropriate tests, benchmarks, and actual experiments. It seems that mixed precision training is not a simple question to answer.

I’d get a 2080ti for sure. The tensor cores are well proven now.

In case anyone else wants a quick refresher on mixed precision training, I found this thread to be great.

I managed to get it working after abandoning CUDA 10 and moving to a later 410 driver release. I'm on the 410 driver working with CUDA 9.2 (not 10) and the PyTorch/fastai dailies, on Ubuntu 18.04.

I get an error with fit and fit_one_cycle() (e.g. even on Part1v3 Lesson 1 Pets): cuda runtime error (11) : invalid argument at /opt/conda/conda-bld/pytorch-nightly_1540205010643/work/aten/src/THC/THCGeneral.cpp:421 (it looks like it happens while allocating memory?).

Weirdly, the workaround is to just run fit or fit_one_cycle again, and it works the second time. Hope that helps someone else.
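
In case it helps, the same workaround can be wrapped so you don't have to rerun the cell by hand. Just a small sketch, assuming `learn` is the lesson 1 learner and that the failure really does surface as a RuntimeError mentioning "invalid argument":

```python
# retry once if the first call dies with the CUDA "invalid argument" error
try:
    learn.fit_one_cycle(1)
except RuntimeError as e:
    if "invalid argument" in str(e):
        learn.fit_one_cycle(1)   # second attempt has worked for me so far
    else:
        raise
```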

Has anyone tried the 2080 Ti yet and run into this issue? Or performed comparative performance tests with the 2080 Ti under fastai?

FWIW, here are some quick benchmarks showing the speed increase vs the Titan Xp and 1080 Ti with fastai. I suppose CUDA 10 will speed things up further once PyTorch officially supports it. https://forums.fast.ai/t/lesson-1-pets-benchmarks/27681/2?u=digitalspecialists

Not to distract you guys from the discussion, but I found it interesting what you can do with a single 1080; in fact, you can win a Kaggle image competition with it. Congrats to b.e.s. and phalanx!

GPU resources

  • I had only a single 1080
  • phalanx had a single 1080 Ti and got another one only during the last week of the competition

I am surprised that it was their first image segmentation problem. They knew nothing about image segmentation 3 months ago. Great!

However, I guess they knew at least something about Machine Learning and Data Science competitions before the competition :smile: