Build your deep learning box: wiki thread

@Moondra I would recommend a single 2080 Ti rather than two GPUs, to avoid the hassle of cooling. With two cards you may end up having to deal with the complications of liquid cooling both of them.

I’m not against that, but it seemed to be pretty intimidating to me, so I decided to stick to a single GPU.

Also, I’m not sure of the current situation, but Intel offers some speedup in MKL-related scenarios, so it might be a better pick than AMD (please verify).

For the mobo, make sure each GPU gets plenty of PCIe lanes. Note: M.2 drives use PCIe lanes too. You would also want to make sure that you’ll have enough RAM for your future needs (64 or 128GB).
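
A rough way to sanity-check the lane budget is sketched below. It assumes the usual maximum link widths (x16 per GPU, x4 per NVMe M.2 drive) and ignores any lanes supplied by the chipset. The CPU lane count in the example is a placeholder, so replace it with the figure from your CPU's spec sheet.

```python
# Rough PCIe lane budget check. All numbers are illustrative assumptions:
# look up the real lane count for your own CPU and motherboard.

def lanes_needed(num_gpus, num_m2_drives, lanes_per_gpu=16, lanes_per_m2=4):
    """Lanes required if every device runs at the given link width."""
    return num_gpus * lanes_per_gpu + num_m2_drives * lanes_per_m2


def fits(cpu_lanes, num_gpus, num_m2_drives, lanes_per_gpu=16):
    """True if the CPU exposes enough lanes for this configuration."""
    return lanes_needed(num_gpus, num_m2_drives, lanes_per_gpu) <= cpu_lanes


if __name__ == "__main__":
    cpu_lanes = 28  # hypothetical CPU; replace with your own value
    for width in (16, 8):
        ok = fits(cpu_lanes, num_gpus=2, num_m2_drives=1, lanes_per_gpu=width)
        print(f"2 GPUs at x{width} + 1 M.2 drive on {cpu_lanes} lanes: {ok}")
```

If the full-width configuration does not fit, the motherboard will typically drop some slots to x8, which is why the sketch lets you vary the per-GPU width.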

Regards,
Sanyam.

@init_27 Thanks for the advice. I have been looking at the complications of cooling, and it seems I need to get cards that have a special type of fan attached. I think they are called “blower fans”.

I will look into the MKL related speedups.

As for the PCIe lanes, I am looking into that as well.
I will most likely go with multiple GPUs and let you know how it goes with cooling.

Thank you so much for your advice.

In the first lecture of Part 2 last year (2018), Jeremy gave these suggestions for choosing a GPU:

  • GTX 1080ti is the price performance sweet spot for today (Mar 2018)
  • GTX 1070 8GB is also fine
  • Used cards are fine
  • Dual GPUs let you train while experimenting

I also read Tim Dettmers’ blog on picking a GPU (updated Nov 2018), and the tl;dr version was:

  • Best GPU overall : RTX 2070
  • GPUs to avoid : Any Tesla card; any Quadro card; any Founders Edition card; Titan V, Titan XP
  • Cost-efficient but expensive : RTX 2070
  • Cost-efficient and cheap : GTX Titan (Pascal) from eBay, GTX 1060 (6GB), GTX 1050 Ti (4GB)
  • I have little money : GTX Titan (Pascal) from eBay, or GTX 1060 (6GB), or GTX 1050 Ti (4GB)
  • I have almost no money : GTX 1050 Ti (4GB); CPU (prototyping) + AWS/TPU (training); or Colab.

In addition, according to the performance-per-dollar graph, the top three are the RTX 2070, GTX 1060, and GTX 1070.

The common factor seems to be that the GTX 1070 is a good option, but it’s surprising to me that this would still be the case a year later. Is there a better choice today to balance performance and price?

Thanks for any suggestions! :grinning:

Update: buried in the comments on Tim Dettmers’ blog is a recommendation for the RTX 2060 as a [relatively] budget option right now.

Thanks for this post!

Here is also the link to Tim Dettmers’ updated blog post.
It seems that if you can afford it, the RTX 2070 is the way to go.

I am also currently thinking about getting my own DL machine. I wonder whether I have to be careful with Windows gaming components, since I might run into trouble with drivers, or is this not an issue at all?
I am thinking of getting a Windows gaming PC and using it with Ubuntu for DL. The setup I currently have in mind is: Nvidia GeForce RTX 2070 8GB GPU, ASUS TUF Z390-Pro Gaming motherboard, Intel Core i5 9600K CPU, 16GB RAM.

Do you think this is a good option?
I am happy for suggestions, links, etc.!

Windows / Ubuntu dual boot is not a problem. You can either use totally separate drives and select the boot drive from the BIOS on startup, or have the Ubuntu GRUB loader set the boot order (Windows/Ubuntu). With the second option you can install either Windows or Ubuntu first; I can’t remember which worked better for me. I just use the first option and select from the BIOS.

I would suggest to start right away with 32GB RAM.

Has anybody tried liquid cooling? I built my own Titan RTX PC and the cards get crazy hot pretty fast.

Check this 4 GPU liquid cooled beast from BIZON TECH! Looks amazing!


4 x NVIDIA TITAN RTX for deep learning:

It would be interesting to understand exactly what they used to cool the four cards. It seems that they chained all the waterblocks.

I’d suggest 64GB of RAM if it fits in your box; last year, while doing GANs, it really helped me.

@bushaev
If you are going to spend money to build a rig, I would recommend against trying to do portable. When you are training, you will find that your battery is your new constraint.

If you are cash constrained (I was), I put together a $235 box and was able to run all the exercises and all the architectures for lessons 1 and 2 (2018) on this box. I got a refurbished Dell i5 desktop for $150 and added a GT 1030 ($85). I bought a KVM so I could share my monitor, keyboard and mouse for about $120, but you could probably buy a keyboard mouse and monitor for less. You don’t really need a high end monitor to run the lessons.

In another thread I saw someone say that this kind of rig is only for people who are not serious about DL. I beg to differ. I’m a dinosaur and needed to do a lot of catching up. It was an excellent choice. Installing fastai in my own environment was really helpful for later troubleshooting. Having only 2GB memory forced me to play around with the parameters (batch size was the key) in the lessons.
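
One way to automate the batch-size hunt on a small card is to halve the batch size until a single training step fits. This is only a minimal sketch: `try_step` is a hypothetical callable you would write yourself, and the sketch assumes it raises RuntimeError when the batch does not fit (which is what PyTorch raises on a CUDA out-of-memory error).

```python
# Halve the batch size until one training step fits in GPU memory.
# `try_step` is a hypothetical callable supplied by you: it should run a single
# training step at the given batch size and raise RuntimeError when it runs
# out of memory.

def find_fitting_batch_size(try_step, start=64, minimum=1):
    bs = start
    while bs >= minimum:
        try:
            try_step(bs)
            return bs
        except RuntimeError:
            bs //= 2  # did not fit: try half the size
    raise RuntimeError("even the minimum batch size does not fit")


if __name__ == "__main__":
    # Stand-in for a real training step: pretend anything above 12 is too big.
    def fake_step(bs):
        if bs > 12:
            raise RuntimeError("CUDA out of memory (simulated)")

    print(find_fitting_batch_size(fake_step))
```

Halving keeps the search short, at the cost of possibly settling on a batch size somewhat below the true maximum.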

In the breeds exercise, just before the learn.save('224_pre'), I was able to get a (slightly) better intermediate result for validation loss and accuracy than the one shown in the lesson:

Lesson shows: [array([0.25087]), 0.918786694858872]
My result: 0.24380622941220575, 0.9212328767123288

In reading all the entries in this wiki, I didn’t see anyone say that every time you change your hardware, it’s time to update your recovery plan. TAKE A BACKUP! KEEP IT CURRENT! I’m sure @jeremy will endorse this advice. This isn’t an issue if you are using Colab or Paperspace or AWS, but for Win10 anyway, it could save you a lot of time.

Despite the gap in the calendar, I’m including the following:

@tensoralex
Both my CPU and GPU go from room temp (about 28 C) to 79-80 C when I’m running a learning step. I know I’m under powered, but could you be overpowered? :laughing:

@surmenok
I took a chance and got a card that calls for a somewhat bigger power supply than my box has. So far I haven’t melted the motherboard.

@chunduri
Not sure what the exchange rate is, but I think my rig is about as cheap as can be made and still work the lessons. Unfortunately it isn’t upgradable. Really, the question is: if you want to learn the fastai stuff, stick with learning the fastai stuff. That way you really know when you have found a hardware constraint.

@QinLu
I can assure you that having only 2GB can be painful.

@pattyhendrix
I’m working with less, but your approach is upgradable.

@silentq
It really depends on whether you want to learn fastai or want to build a powerful rig and have the money and time to do so. Not an expert on the CPU side, but my rig shows the CPU 100% busy when the learn epochs start.

@tank13
I have almost no money. I have a GT 1030 with 2GB. Since I got it squared away, I haven’t had to resort to colab, paperspace or AWS. Only drawback I’ve found is that sometimes I have to go drink a cold one or two, or leave it running overnight to complete a learn.

Sorry for going on so long. I’m still catching up.

That’s good that the power supply works for you. If a power supply is not powerful enough, your machine will just reboot on power consumption spikes, as happened in my case. I’ve read that a low-quality power supply can lead to melted pins on some devices if the PSU has insufficient surge protection.

Hey! No one wants a melted motherboard. I don’t have a way to monitor power consumption. Wimpy is a dedicated box. I don’t run much else on it and definitely nothing else when I’m running the course work. When I’m running a learn cycle the temp of the cpu and gpu both go up to about 80C. When the box is idle they both read about 30. From what I can observe, it seems like a smooth process, no spikes. Wimpy is kind of slow, so the processors can spend hours at that high temp.
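
For NVIDIA cards, `nvidia-smi` can report temperature and, on cards that support it, power draw. The sketch below polls it once and parses the CSV output. The function names are made up for illustration, and some lower-end cards report power draw as unavailable, which the parser maps to None.

```python
import subprocess

# Query fields understood by nvidia-smi; output is one CSV line per GPU.
QUERY = ["nvidia-smi",
         "--query-gpu=index,temperature.gpu,power.draw",
         "--format=csv,noheader,nounits"]


def parse_gpu_stats(text):
    """Parse nvidia-smi CSV output into a list of dicts.

    Fields the driver cannot report (shown as e.g. "[N/A]") become None.
    """
    def to_float(value):
        try:
            return float(value)
        except ValueError:
            return None

    stats = []
    for line in text.strip().splitlines():
        index, temp, power = [field.strip() for field in line.split(",")]
        stats.append({"gpu": int(index),
                      "temp_c": to_float(temp),
                      "power_w": to_float(power)})
    return stats


def read_gpu_stats():
    """Run nvidia-smi once and return the parsed readings."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(out.stdout)
```

Calling `read_gpu_stats()` in a loop with a short sleep while a learn cycle runs would show whether the readings climb smoothly or spike.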

I tried with a couple of machines: https://medium.com/@bingobee01/watercooling-a-deep-learning-machine-46608f6acfee

I put a premix solution in one machine, and it seems to have reacted with either the silver kill coil or a trace of another additive, creating high pressure that caused leaks. Fortunately, they were non-destructive. I’ve gone back to air for now, until I have 4 GPUs that I need to cool (and I won’t use a premix again).

I am considering cooling two 1080ti with AIOs (one of them gets above 90C). What you report makes me worry, though. What do you mean exactly by “premix”?

Premix = a coloured solution with additives - eg Mayhems (which reacted with something in my loop).

Distilled water + a few drops of biocide + a silver kill coil should be fine.

I was running 2x GPUs in a custom loop with no problems (just distilled water + biocide) for a few months, until I rebuilt for a 3rd GPU and used the coloured additive…

The other thing you could try is high-airflow fans at the front of your case to get more air to the GPU intakes: either the Noctua Industrial PPC 2000rpm if you need the static pressure (noisy at full speed), or the Noctua high-airflow 1500rpm ones (I have found Noctua a good balance between performance and sound level).

Hey, guys,
I put up a mock-up for a new machine. What do you guys think?

I chose a 1600-watt PSU because ultimately I want to expand to 4 graphics cards.
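
A back-of-the-envelope way to size a PSU is to sum the nominal draw of the parts and add headroom for spikes. In the sketch below, the 250 W per RTX 2080 Ti and 180 W for the Threadripper 1950X are assumed spec-sheet figures, and the 100 W allowance for everything else and the 25% headroom are arbitrary assumptions, not measurements.

```python
# Back-of-the-envelope PSU sizing. The wattages are assumed nominal figures,
# not measurements; check the spec sheets for your exact parts.

def recommended_psu_watts(gpu_watts, num_gpus, cpu_watts,
                          other_watts=100, headroom=0.25):
    """Sum the nominal draw and add a safety margin for transient spikes."""
    load = gpu_watts * num_gpus + cpu_watts + other_watts
    return load * (1 + headroom)


if __name__ == "__main__":
    for n in (2, 4):
        watts = recommended_psu_watts(gpu_watts=250, num_gpus=n, cpu_watts=180)
        print(f"{n} GPUs: about {watts:.0f} W")
```

Real cards can exceed their nominal rating for short periods, so treat the result as a starting point rather than a guarantee.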

I’m not really sure whether the case will have heat issues with 4 GPUs running.

I’m also not sure the turbo version of the RTX 2080 Ti is the ideal choice. It does have a blower-style fan, but I saw some really bad reviews on Amazon.

I decided to change to the Gigabyte turbo, but PCPartPicker didn’t have it listed at the time I made this list, so I just used a filler.

PCPartPicker part list: https://pcpartpicker.com/list/t4HKjy
Price breakdown by merchant: https://pcpartpicker.com/list/t4HKjy/by_merchant/

CPU: AMD - Threadripper 1950X 3.4 GHz 16-Core Processor ($584.00 @ Amazon)
CPU Cooler: Noctua - NH-U14S TR4-SP3 82.52 CFM CPU Cooler ($79.90 @ Amazon)
Motherboard: ASRock - X399 Taichi ATX TR4 Motherboard ($279.89 @ OutletPC)
Memory: Corsair - Vengeance LPX 64 GB (4 x 16 GB) DDR4-3200 Memory ($399.99 @ Newegg)
Storage: Western Digital - SN750 1 TB M.2-2280 Solid State Drive ($229.99 @ Amazon)
Storage: Seagate - Barracuda 2 TB 3.5" 7200RPM Internal Hard Drive ($58.44 @ OutletPC)
Video Card: Gigabyte - GeForce RTX 2080 Ti 11 GB AORUS Video Card (2-Way SLI) ($1229.99 @ Newegg)
Video Card: Gigabyte - GeForce RTX 2080 Ti 11 GB AORUS Video Card (2-Way SLI) ($1229.99 @ Newegg)
Case: Corsair - 780T ATX Full Tower Case ($179.99 @ Amazon)
Power Supply: Corsair - 1600 W 80+ Titanium Certified Fully-Modular ATX Power Supply ($339.99 @ Amazon)
Mouse: Logitech - G502 HERO Wired Optical Mouse ($64.84 @ Amazon)
Total: $4677.01
Prices include shipping, taxes, and discounts when available
Generated by PCPartPicker 2019-04-13 06:48 EDT-0400
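
To double-check a list like this after swapping parts, the prices can be summed directly from the text. The sketch below assumes each line ends with a "($price @ merchant)" suffix as in the list above. The regular expression and function name are illustrative only.

```python
import re
from decimal import Decimal

# Matches a trailing "($123.45 @ Merchant)" and captures the amount.
PRICE = re.compile(r"\(\$([0-9]+\.[0-9]{2}) @ [^)]*\)\s*$")

PARTS = [
    "CPU: AMD - Threadripper 1950X 3.4 GHz 16-Core Processor ($584.00 @ Amazon)",
    "CPU Cooler: Noctua - NH-U14S TR4-SP3 82.52 CFM CPU Cooler ($79.90 @ Amazon)",
    "Motherboard: ASRock - X399 Taichi ATX TR4 Motherboard ($279.89 @ OutletPC)",
    "Memory: Corsair - Vengeance LPX 64 GB (4 x 16 GB) DDR4-3200 Memory ($399.99 @ Newegg)",
    "Storage: Western Digital - SN750 1 TB M.2-2280 Solid State Drive ($229.99 @ Amazon)",
    'Storage: Seagate - Barracuda 2 TB 3.5" 7200RPM Internal Hard Drive ($58.44 @ OutletPC)',
    "Video Card: Gigabyte - GeForce RTX 2080 Ti 11 GB AORUS Video Card (2-Way SLI) ($1229.99 @ Newegg)",
    "Video Card: Gigabyte - GeForce RTX 2080 Ti 11 GB AORUS Video Card (2-Way SLI) ($1229.99 @ Newegg)",
    "Case: Corsair - 780T ATX Full Tower Case ($179.99 @ Amazon)",
    "Power Supply: Corsair - 1600 W 80+ Titanium Certified Fully-Modular ATX Power Supply ($339.99 @ Amazon)",
    "Mouse: Logitech - G502 HERO Wired Optical Mouse ($64.84 @ Amazon)",
]


def total_price(lines):
    """Sum the trailing '($price @ merchant)' amount of each line."""
    total = Decimal("0")
    for line in lines:
        match = PRICE.search(line)
        if match:
            total += Decimal(match.group(1))
    return total


if __name__ == "__main__":
    print(f"Total: ${total_price(PARTS)}")
```

Decimal is used instead of float so that the cents add up exactly.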

Besides more PCIe lanes, is there any reason to go with Threadripper? Most of the multi-GPU builds I have seen from people and companies (Lambda, Bizon, etc.) use Intel CPUs.
Also, especially for a multi-GPU 2080 Ti setup, go with blower cards if possible. They heat up like crazy in a multi-GPU setup.

I am building my own deep learning box for research purposes. It is mainly going to be 2 GPUs, and I might add a third one in the future (if two play well). Below is the part list that I came up with. For the GPU, I needed EVGA, as they are the only ones that provide an international warranty, and their blower-style card for the 2080 Ti is still not out. Since I am not going for 4 GPUs, I think I don’t have to have blower-style ones, right? For the CPU, I didn’t want it to be too powerful. The mobo is the hard one, and I am not sure I got the right one if I want to add a third GPU down the road. I saw lots of 4-GPU builds with the ASUS WS X299 SAGE, but settled on the MARK 2 for now.
How will the 7800X perform with two 2080 Tis (or even three in the future), or should I go with the 9820X for more PCIe lanes (and future resale value), which is almost 3 times the 7800X price?

PCPartPicker part list / Price breakdown by merchant

CPU: Intel - Core i7-7800X 3.5 GHz 6-Core Processor ($398.89 @ B&H)
CPU Cooler: Corsair - H100i v2 70.69 CFM Liquid CPU Cooler ($187.99 @ OutletPC)
Motherboard: Asus - TUF X299 MARK 2 ATX LGA2066 Motherboard ($240.23 @ Amazon)
Memory: Corsair - Vengeance LPX 64 GB (4 x 16 GB) DDR4-3200 Memory (no price listed)
Storage: Samsung - 960 EVO 500 GB M.2-2280 Solid State Drive ($169.89 @ OutletPC)
Storage: HP - EX920 1 TB M.2-2280 Solid State Drive ($144.89 @ OutletPC)
Video Card: EVGA - GeForce RTX 2080 Ti 11 GB FTW3 ULTRA GAMING Video Card (2-Way SLI) ($1479.00 @ Amazon)
Video Card: EVGA - GeForce RTX 2080 Ti 11 GB FTW3 ULTRA GAMING Video Card (2-Way SLI) ($1479.00 @ Amazon)
Case: Corsair - Air 540 ATX Full Tower Case ($115.53 @ Amazon)
Power Supply: Corsair - RMx (2018) 850 W 80+ Gold Certified Fully-Modular ATX Power Supply ($119.99 @ Amazon)
Wireless Network Adapter: Asus - PCE-AC51 PCI-Express x1 802.11a/b/g/n/ac Wi-Fi Adapter ($28.89 @ OutletPC)
External Storage: Seagate - Backup Plus Slim 2 TB External Hard Drive ($64.99 @ Adorama)
Total (before mail-in rebates): $4504.29
Mail-in rebates: -$75.00
Total: $4429.29
Prices include shipping, taxes, rebates, and discounts
Generated by PCPartPicker 2019-04-23 19:00 EDT-0400

I initially chose a 9820X, but changed my mind after reading a guide from an MIT researcher; he replied in a Reddit post that he wished he had chosen the TR 1950X.

After testing the system myself, Threadripper still lags on OpenBLAS, so if your workload involves a lot of data augmentation, stay away from AMD for now.
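
If you want to check this on your own machine, a quick test is to see which BLAS library NumPy was built against and time a matrix multiply. This is only a rough sketch: a matrix multiply is a proxy for BLAS speed, not for a full data-augmentation pipeline, and the matrix size and repeat count are arbitrary choices.

```python
import time


def best_time(fn, repeats=3):
    """Return the fastest wall-clock time of several runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def matmul_seconds(n=1024, repeats=3):
    """Time an n x n float32 matrix multiply; returns None if NumPy is missing."""
    try:
        import numpy as np
    except ImportError:
        return None
    rng = np.random.default_rng(0)
    a = rng.random((n, n), dtype=np.float32)
    b = rng.random((n, n), dtype=np.float32)
    return best_time(lambda: a @ b, repeats)


if __name__ == "__main__":
    try:
        import numpy as np
        np.show_config()  # prints which BLAS/LAPACK NumPy was built against
    except ImportError:
        print("NumPy is not installed")
    print("matmul seconds:", matmul_seconds())
```

Running the same script under an MKL-linked and an OpenBLAS-linked NumPy on the same CPU gives a like-for-like comparison.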

Yes, the Gigabyte turbo has a blower-style fan. I took out the hard drive bracket for maximum airflow.

Actually, the guide from the MIT researcher is sublime!! 7000 dollars for a 4-GPU setup, which is half of what Lambda charges for its 4-GPU workstation.

I’ll separate the link below.
http://l7.curtisnorthcutt.com/build-pro-deep-learning-workstation

I was initially confused as to whether they replaced the 3x open-air GPUs with 3x blowers, but it appears that in the final build they are using 3x open-air GPUs and 1x blower-style. It would be interesting to see the temperatures.

On motherboards, another option is to buy a used X99 motherboard and a used CPU to fit it (e.g. i7 5820K or Xeon E5 1/2nnn v4). That is much cheaper than a new X299 motherboard and i7/i9 X-series CPU where I am.