GTX 2080/2080Ti RTX for Deep Learning?

If you’re doing something that can take advantage of fp16 instead of fp32, won’t the 2080 effectively have more memory?

Probably not, because we need to keep an fp32 copy of the weights too - it’s actually mixed precision training. I haven’t actually tested this however.
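
To make the point above concrete, here is a rough back-of-the-envelope sketch. It assumes the common mixed precision recipe (fp16 weights and gradients plus an fp32 master copy of the weights) and ignores activations and optimizer state entirely. `weight_memory_bytes` is a made-up helper for illustration, not a fastai or PyTorch function, and the parameter count is arbitrary.

```python
# Bytes per element for each precision.
FP16_BYTES = 2
FP32_BYTES = 4


def weight_memory_bytes(n_params: int, mixed_precision: bool) -> int:
    """Rough memory for weights + gradients only (hypothetical helper).

    Assumption: mixed precision keeps fp16 weights and gradients plus
    an fp32 master copy of the weights. Activations and optimizer
    state are deliberately ignored.
    """
    if mixed_precision:
        # fp16 weights + fp16 gradients + fp32 master weights
        per_param = FP16_BYTES + FP16_BYTES + FP32_BYTES
    else:
        # fp32 weights + fp32 gradients
        per_param = FP32_BYTES + FP32_BYTES
    return n_params * per_param


n = 25_000_000  # arbitrary example size, not a measured model
fp32_total = weight_memory_bytes(n, mixed_precision=False)
mixed_total = weight_memory_bytes(n, mixed_precision=True)
print(fp32_total, mixed_total)
```

Under these assumptions the weight-related memory comes out the same either way, so any saving would have to come from activations being stored in fp16, which this sketch does not model.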

@jeremy Would you advise that it is still worth choosing an architecture with fp16 support rather than a 10xx one? I mean, from the perspective of the next 1-2 years? Or is it fine for now to stick with “proven” hardware and pick the previous generation? From your personal point of view, of course =)

Though I think my question is probably a bit too vague to answer right now without appropriate tests, benchmarks, and actual experiments. It seems that whether mixed precision training pays off is not a simple question to answer.

I’d get a 2080ti for sure. The tensor cores are well proven now.

In case anyone else wants a quick refresher on mixed precision training, I found this thread to be great.

I managed to get it working after abandoning CUDA 10 and moving to a later 410 driver release. I’m now on a 410 release with CUDA 9.2 (not 10), the PyTorch/fastai dailies, and Ubuntu 18.04.

I get an error with fit() and fit_one_cycle() (even on Part1v3 Lesson 1 Pets): cuda runtime error (11) : invalid argument at /opt/conda/conda-bld/pytorch-nightly_1540205010643/work/aten/src/THC/THCGeneral.cpp:421 (looks like it is allocating memory?).

Weirdly, the workaround is to just run fit or fit_one_cycle again, and it works the second time. Hope that helps someone else.

Has anyone else tried the 2080 Ti yet and run into this issue? Or run comparative performance tests with the 2080 Ti under fastai?

FWIW here are some quick benchmarks showing the speed increase vs titan xp and 1080 ti with fastai. I suppose cuda 10 will speed things up further once pytorch officially supports it. https://forums.fast.ai/t/lesson-1-pets-benchmarks/27681/2?u=digitalspecialists

Not to distract you guys from the discussion, but I found it interesting what you can do with a single 1080: you can in fact win a Kaggle image competition. Congrats to b.e.s. and phalanx!

GPU resources

  • I had only single 1080
  • phalanx had single 1080Ti and got another one only during the last week of competition

I am surprised that it was their first image segmentation problem. They knew nothing about image segmentation 3 months ago. Great!

However, I guess they knew at least something about Machine Learning and Data Science competitions before the competition :smile:

I wonder if NVLink is really necessary. I doubt the PCIe bus would be a bottleneck with just two cards.

In that case, and if one plans to leverage parallelism, two 2070s could be cheaper and better than a single 2080ti, in particular when it comes to memory.

I just placed an order for an Asus RTX 2070 8GB Turbo (blower fan), to install next to my Asus 1080Ti 11GB Turbo with a Ryzen 1700x + 1TB Samsung SSD.

So I hope to run some tests with fastai in the coming days, trying mixed 16/32 precision with its Tensor Cores (the 1080Ti can’t do this), most likely using @radek’s starter pack for the Quick Draw competition on Kaggle.

According to http://on-demand.gputechconf.com/gtc/2018/video/S81012/ (~ at 09 mins), where a lead Nvidia engineer for PyTorch presents a real case study, “using multiple of 8” is critical :sunglasses:
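
A minimal sketch of what “using multiple of 8” could look like in practice, assuming it refers to keeping tensor dimensions (layer widths, batch size, vocabulary size and so on) divisible by 8 so that fp16 operations are eligible for Tensor Cores. `round_up_to_multiple` is a hypothetical helper, and the sizes below are arbitrary examples rather than values from the talk.

```python
def round_up_to_multiple(n: int, multiple: int = 8) -> int:
    """Round n up to the nearest multiple (hypothetical helper)."""
    if n < 0 or multiple <= 0:
        raise ValueError("n must be >= 0 and multiple must be > 0")
    # Integer ceiling division, then scale back up.
    return ((n + multiple - 1) // multiple) * multiple


# Arbitrary example sizes: some already aligned, some not.
for size in (10, 64, 1000, 33278):
    print(size, "->", round_up_to_multiple(size))
```

A size that is already a multiple of 8 is returned unchanged, so the helper can be applied to every dimension without checking first.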

Hmm, I wonder how many of the existing best models satisfy that requirement.

The 2070 is the most cost-effective, as per Tim Dettmers: http://timdettmers.com/2018/11/05/which-gpu-for-deep-learning/

A used 1080Ti is about the same price as a new 2070. Is the 2070 still a better choice than the 1080Ti even at the same price (pre-owned vs. new, though)?

That’s the idea, as Tim favors the RTX 2070 as the “best value GFX for DL, and Kaggle” today.
He even mentions in the comments that 2x 2070 might be better, while cheaper, than a single 2080Ti for most users, as it allows faster exploration of training settings (pix_size, models, # of epochs, etc.).
Use Ctrl-F + 2070 to zoom into those nuggets.

I’m curious to see how the cheapest TensorCores consumer GFX at €550 compares with the previous “King of the Kill” of the 10xx line-up (I bought mine refurbished for €700 in April 2017).

In Sweden, the last remaining 1080Ti’s now retail for €950, while new 2080’s go for €900 and 2080Ti’s for €1,300.

If you want to activate FP16 with fastai, you call to_fp16() when you create your learner, not when you run it.

As in learn = create_cnn(data, models.resnet34, metrics=error_rate).to_fp16()

edit: this was done on a 1080Ti, prior to receiving my RTX 2070. It doesn’t work for the 2070 as it crashes my kernel.

Are more changes needed to make to_fp16() work?
I tried:

learn = create_cnn(data, models.resnet34, metrics=accuracy, model_dir='.models').to_fp16()
learn.fit_one_cycle(1)

and it fails with

RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

but without the to_fp16() it works fine…

Try running the cell again after it errors. If that fails, try running the cell (getting the error), running learn.model.cuda(), then running it again.

I’ve run into several weird cuda/cudnn errors that seem to be solved by just running the cell again. It’s like things don’t work out the first time you try to run a model, but after that everything’s fine.
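
The “just run it again” workaround can be sketched as a small retry wrapper. This is only an illustration: `run_with_one_retry` and `flaky_fit` are hypothetical names, and the fake function below merely imitates a call that fails on its first attempt, so nothing here touches CUDA or fastai.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_one_retry(fn: Callable[[], T]) -> T:
    """Call fn; if it raises RuntimeError, call it exactly once more."""
    try:
        return fn()
    except RuntimeError as err:
        print(f"first attempt failed: {err!r}; retrying once")
        # A second failure propagates to the caller unchanged.
        return fn()


# Stand-in for a training call that errors only the first time.
calls = {"count": 0}


def flaky_fit() -> str:
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("cuda runtime error (11) : invalid argument")
    return "ok"


result = run_with_one_retry(flaky_fit)
print(result)
```

With a fastai learner this would presumably be called as `run_with_one_retry(lambda: learn.fit_one_cycle(1))`. It only papers over the error and does not address its cause.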

I was using learn = to_fp16(…) rather than (…).to_fp16. See if that works for you.

I’m on the 410 driver as well. I usually get a CUDA error 11 the first time I ‘learn.fit_one_cycle(1)’ but it works without a hitch when I re-run the cell.

Hi Ilia, with my tail between my hind legs and head bowed in shame I must confess that I have rendered my ‘upgrade’ and planned successive performance steps USELESS, for now. Suffice to say that I can’t even boot from my poor old (formerly) trusty tower anymore. I do have an Asus G75V with 2T of ssd and have been doing The Master’s (Jeremy’s) course on it. Believe me Ilia, I am working on it; I am even reading the installation instructions for the Samsung 2T vnand 860 pro and the m.2 970 evo. The Gigabyte 2080 Ti oc looks GREAT, good for photos. When I do get it all working, AI will probably still be popular.
Is there a general thread where I can sing Jeremy’s (and Rachel’s) praises? I think this should be done from the global AI rooftops. He, virtually alone, has revived AI to a fabulous state, translated it from ‘nerd-speak’, worked like a demon in proving himself and his methods (non-conformist, and for good reason), wrestled AI from the strict realms of the academics, read heaps of papers and sorted them, cut through the inbred jargon, turned superbly understandable lessons and videos into THE premier AI course, and presents it FREE for us unclean masses in the fabulous MOOC. WHAT A GREAT GUY! And, of course, he is Australian, AND from Melbourne. Thanks also to the University of San Francisco for supporting him in this Globally Disruptive Technological thrust. This will inspire global changes to many things, you ain’t seen nothin’ yet.
I’ll try to keep you informed if there is any significant progress on my little desktop, Ilia.
Cheers for now,
Peter Kelly

I just installed the RTX 2070. It works fine in standard precision, but when I try .to_fp16(), or “learn = to_fp16(create_cnn(…))”, it crashes my kernel without an error message in Jupyter Notebook.
So rather hard to debug :slight_smile:

I’m using Ubuntu 16.04 and nvidia 410.73.

Strangely, the .to_fp16() command works with my 1080Ti without crashing the kernel (but with no performance boost), hence my post yesterday: https://forums.fast.ai/t/gtx-2080-2080ti-rtx-for-deep-learning/26783/60?u=ericpb