Recommendations on new 2 x RTX 3090 setup

tcapelle · September 28, 2020, 7:08am

Lots of guessing. I will wait for the Puget benchmarks. Are there any official Nvidia benchmarks?

balnazzar · September 30, 2020, 3:23pm

Still no benchmarks related to something massive like a big transformer.

Also, note how the speedups are quite dependent upon the arch.

init_27 · September 30, 2020, 3:48pm

I have access to an RTX Titan now, along with a 2080 Ti. Once I get my hands on a 3090 (ETA <1 week), I’ll put together a benchmark video.

balnazzar · September 30, 2020, 4:07pm

If I may… Will Nvidia send one to you directly for benchmarking purposes?

init_27 · September 30, 2020, 5:07pm

Unfortunately, no!

I’ll be buying one. They have indeed sent me the Titan though, so that saves me the cost of buying a second 24 GB GPU. (It was supposed to reach me much earlier, but the pandemic caused delays. I’m still very grateful though.)

balnazzar · September 30, 2020, 6:34pm

A Titan is definitely better than nothing! Try and ask for the 3090 though, maybe they are interested in promoting benchmarks & reviews…

init_27 · September 30, 2020, 7:03pm

Oh yes, it’s amazing

I did try requesting one (in fact, I asked for the Titan to be switched for a 3090, but the ship had quite literally sailed), but it might take a while. So I think I’ll get the first one myself.

init_27 · October 1, 2020, 7:27am

Looks like I might be going Team Blue:

I’ve requested quotes for a 3960X build vs. an i9-10920X build. If the difference is <$300, I’d go Team Red; otherwise I’d go for the Intel build. AFAIK, I don’t need more than 12 cores yet.

Do you all think it’d be beneficial to get the 3960X instead of the i9-10920X or i9-10980XE?

Those are the options that I’m currently looking at.

TIA!

tcapelle · October 1, 2020, 7:35am

Let’s build a benchmark repo on top of fastai v2 (using mixed precision training).
I have a 2070 Super, a 2080 Ti and a Quadro 8000 to test.
Let’s build our own PyTorch/fastai-based benchmark.
BTW: I have a 3900X and it is pretty solid, but if you make heavy use of MKL, Intel may be the only way.
The 12-core CPU is enough for my 2-GPU setup.
It could be something like this: https://github.com/tcapelle/fastbench

balnazzar · October 1, 2020, 7:41am

It depends.

Pros:

- Best performance in its price range. It is vastly superior to the 10980XE, and even to the W-3275.
- I don’t see any other pro.

Cons: everything else.

- Gaming-oriented motherboards. No workstation features (like IPMI).
- Poor slot layout on almost all the available mobos, and just 4 full-length slots.
- On top of that, no mobo in the EEB format, so the last slot is not really usable.
- Almost no TRX40 motherboard can handle two 3090s with at least one vacant slot in between.
- Monstrous power draw, well above its 280 W TDP.
- Enormous heat generation. They are difficult to cool even with a 360 mm radiator. Air cooling is not viable.
- Still weak at MKL-related tasks.

Want to go with AMD? Buy an Epyc 7282 (16 cores; it’s basically a 3950X with 128 lanes and four memory channels). It’s half the price of the 3960X, it draws much less power, and it’s still above the 10980XE. Mind, however, that no Epyc motherboard supports sleep/hibernation yet.

For full disclosure, I bought a Supermicro X11SPA (note how flexible its slot layout is) and a Xeon 6240 (18 cores, used, from China via eBay).

balnazzar · October 1, 2020, 7:43am

And it’s only 16 lanes. But x8/x8 is OK for 2 GPUs, even at gen3.
I’d be a bit unhappy with just 2 memory channels though.

tcapelle · October 1, 2020, 7:49am

It’s a budget workstation. It has PCIe 4.0, so x8 is more than enough.
I am pretty sure it is x16 + x8.
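
As a rough sanity check on the x8 vs. x16 question, here is a minimal sketch of the theoretical one-direction bandwidth of a PCIe link. It uses only the published per-lane transfer rates (8 GT/s for gen3, 16 GT/s for gen4) and the 128b/130b line encoding; the function name is made up for illustration, and real throughput will be lower once protocol overhead is included.

```python
# Theoretical peak bandwidth of a PCIe link in one direction.
# Assumption: only line encoding is accounted for, not packet overhead.
GT_PER_S = {3: 8.0, 4: 16.0}   # giga-transfers per second, per lane
ENCODING = 128 / 130            # payload bits per transferred bit (gen3/gen4)


def link_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Return the theoretical peak in GB/s for a given generation and width."""
    return GT_PER_S[gen] * ENCODING / 8 * lanes


for gen, lanes in [(3, 8), (3, 16), (4, 8), (4, 16)]:
    print(f"gen{gen} x{lanes}: {link_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

On paper a gen4 x8 link carries the same as a gen3 x16 link. Whether that is enough for a given training job depends on how much data moves between host and GPU, which is something only a benchmark can settle.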

balnazzar · October 1, 2020, 7:49am

Good idea. I have four 2060 Supers, but the CPU/mobo complex is currently being replaced (old stuff sold, new stuff ordered).
I’ll get back to you once I have the machine operational again.

init_27 · October 1, 2020, 7:53am

Thanks for the pointers, @balnazzar. I think I’ll go with a 3960X only if the price diff is <$300; otherwise I don’t see any harm in saving $$$ and getting “just 12 cores”, given the cons you’ve laid out.

Love the test bench idea, and I would be more than interested in contributing once I get my hands on the GPU.

balnazzar · October 1, 2020, 7:53am

They have a total of 24 lanes, indeed. But 8 of them are usually allocated for a couple of m.2 slots.

tcapelle · October 1, 2020, 8:00am

Ah OK, the X570 chipset supports 40 lanes, but the CPU only 24.
I have not seen any slowdown from using both GPUs simultaneously.
Which benchmarks should we include?

- A ResNet-50 training run
- A Transformer model
- A recurrent model like AWD-LSTM

balnazzar · October 1, 2020, 8:06am

Good. That would be a nice change from the usual convnet-only benchmarks.

init_27 · October 1, 2020, 8:07am

I’d love to see an x8/x8 benchmark too.

Maybe there is a way to limit a GPU to x8 lanes in software? I’d give it a try then.
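
Whatever method ends up being used to restrict the link, it helps to confirm what width each GPU actually negotiated before benchmarking. The sketch below only reports the link and does not limit it. It assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` fields for PCIe generation and width; the helper names `parse_link_rows` and `query_links` are made up for illustration.

```python
import subprocess

# Fields understood by `nvidia-smi --query-gpu`.
QUERY = "index,name,pcie.link.gen.current,pcie.link.width.current,pcie.link.width.max"


def parse_link_rows(csv_text: str) -> list:
    """Parse `--format=csv,noheader` output into one dict per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        idx, name, gen, width, max_width = [field.strip() for field in line.split(",")]
        rows.append({
            "index": int(idx),
            "name": name,
            "gen": gen,              # kept as text; some systems report "[N/A]"
            "width": width,
            "max_width": max_width,
        })
    return rows


def query_links() -> list:
    """Run nvidia-smi and return the current PCIe link state of every GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_link_rows(out)
```

Calling `query_links()` on the benchmark machine returns one entry per GPU. The reported generation can drop while a card is idle, so it is best read while a job is running.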

tcapelle · October 1, 2020, 8:15am

We could reuse the original train_imagenette.py scripts.
I would start with:

Train imagenette with
- bs=64, fp16=False
- bs=128, fp16=True
Train tabular with
- bs=64, fp16=False
- bs=128, fp16=True
Train imdb with
- bs=64, fp16=False
- bs=128, fp16=True
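
The grid above could be written down once and iterated over, so every machine runs exactly the same combinations. This is only a sketch: `BenchConfig` and its field names are hypothetical and are not taken from the fastbench repo.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class BenchConfig:
    """One benchmark run: which task, which batch size, fp16 on or off."""
    task: str
    bs: int
    fp16: bool


TASKS = ["imagenette", "tabular", "imdb"]
SETTINGS = [(64, False), (128, True)]  # (batch size, mixed precision)

# Every task is paired with every (bs, fp16) setting: 3 x 2 = 6 runs.
CONFIGS = [BenchConfig(task, bs, fp16) for task, (bs, fp16) in product(TASKS, SETTINGS)]

for cfg in CONFIGS:
    print(cfg)
```

Keeping the grid in one list makes it easy to add a setting later (for example a larger batch size for 24 GB cards) without touching the training code.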

We could use a wandb callback to log all results!

OK, I started a very simple repo here: https://github.com/tcapelle/fastbench
The script is bench.py. How should we log the info, and what info exactly?
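
One possible answer to the logging question, as a sketch only: record the hardware, the run settings, the wall-clock time and the resulting throughput, and emit one JSON line per run. All names here (`BenchResult`, `timed_run`, `to_json_line`, the placeholder GPU string) are hypothetical, and the stand-in training function just sleeps.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass
class BenchResult:
    gpu: str
    task: str
    bs: int
    fp16: bool
    n_samples: int
    seconds: float

    @property
    def samples_per_sec(self) -> float:
        return self.n_samples / self.seconds


def timed_run(train_fn: Callable[[], int], **meta) -> BenchResult:
    """Time train_fn, which must return the number of samples it processed."""
    start = time.perf_counter()
    n_samples = train_fn()
    seconds = time.perf_counter() - start
    return BenchResult(n_samples=n_samples, seconds=seconds, **meta)


def to_json_line(result: BenchResult) -> str:
    """Serialize a result, including the derived throughput, as one JSON line."""
    record = asdict(result)
    record["samples_per_sec"] = result.samples_per_sec
    return json.dumps(record)


def fake_epoch() -> int:
    """Stand-in for a real training epoch (assumption: 1000 samples)."""
    time.sleep(0.01)
    return 1000


res = timed_run(fake_epoch, gpu="placeholder-gpu", task="imagenette", bs=64, fp16=False)
print(to_json_line(res))
```

For real GPU runs the timer should only stop after the device has finished its queued work (in PyTorch, by calling `torch.cuda.synchronize()` first), since CUDA kernels are launched asynchronously.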

init_27 · October 1, 2020, 10:16am

Hi @balnazzar, my money rests on your advice

I’ve decided to go with Team Blue!

Should I go with the 10900X or the 10940X?
10 cores vs. 14 cores: the price difference in India is ~$400, but I’m worried that 10 cores might be too few if my requirements grow in the future.

What do you think?