Thanks, that’s a great description. I only have room for 2 GPUs, and both slots are already in use. I was looking for ways to connect more with splitters, but then gave up and decided 2 is enough for now.
2 1080s, not enough RAM for some lessons.
How’s the 2080 compared to the 1080? I have a 1080ti and am thinking about adding another GPU. Any recommendations?
How much would be enough?
Is there any post with hardware requirements for the V3 course, other than this one?
I’m asking because I want to build a PC, which I would like to use for the course. Thanks in advance
Not sure, only @jeremy knows for sure. In the last class he was using a 1080Ti, which has 11GB of RAM; this time there are faster and better cards available. If I were you, I’d use Paperspace or one of the other cloud options. I built my machine in December 2017 and it cost me almost $4000 US. I don’t know what it would cost now. If you have a lot of money, I’d go for the latest 2080Ti.
Anything is fine as long as pytorch v1 supports it. I still have 12GB GPU RAM. You can always reduce batch size.
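The "reduce batch size" advice can be sketched as a retry loop. This is a minimal sketch, not fastai or PyTorch API: `train_step` and `fake_train_step` are hypothetical stand-ins for a real training run, and the loop assumes the out-of-memory failure surfaces as a `RuntimeError` whose message contains "out of memory", which is how PyTorch has commonly reported CUDA OOM.

```python
def fit_with_smaller_batches(train_step, batch_size, min_batch_size=1):
    """Call train_step(batch_size), halving batch_size after each out-of-memory error.

    train_step is a hypothetical callable standing in for one training run.
    Returns the first batch size that completed without running out of memory.
    """
    while batch_size >= min_batch_size:
        try:
            train_step(batch_size)
            return batch_size
        except RuntimeError as err:
            # Assumption: OOM shows up as a RuntimeError mentioning "out of memory".
            if "out of memory" not in str(err):
                raise  # some other failure, do not hide it
            batch_size //= 2
    raise RuntimeError("out of memory even at the minimum batch size")


def fake_train_step(batch_size):
    # Stand-in for real training: pretend anything above 16 does not fit in GPU RAM.
    if batch_size > 16:
        raise RuntimeError("CUDA out of memory (simulated)")
```

With the simulated step above, starting from a batch size of 64 the loop tries 64, then 32, then settles on 16. With a real model you would usually also lower the learning rate when the batch size drops a lot.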
If you’re aiming for a cheaper GPU, I have a laptop with an 8GB GTX 1070. It does the job for me, and I’m sure the training time for Part 1 won’t be very long. For Part 2 and a more future-proof machine, you might want a better GPU.
Hi @matdmiller - interesting post. I have this motherboard: Asus WS X299 Pro ATX with a single 1080ti.
Do you know if adding a second GPU is okay with this set-up e.g. another 1080ti or maybe a 2080?
That motherboard is capable of driving multiple GPUs. I would need more info on your specific build to answer your question with a good degree of confidence. If you post a detailed list of components (PC Part Picker) then I could evaluate it against what I know. I’ve only ever built 2 machines so I’m by no means an expert, but I have learned a lot from building my deep learning rig. I would put your build into PC Part Picker as it helps catch some issues with incompatible parts.
Some things I would look at are:
- You need a minimum of 8 PCIe lanes per card. You can check how many PCIe lanes your CPU has by googling your CPU, clicking on the ark.intel.com link and looking under “Max # of PCI Express Lanes”. Most medium to high end CPUs have at least 28 lanes so you should be fine. If you want to get every last ounce of performance you’d want 16 lanes available per card, which means at least 40 lanes available on your CPU. I have seen discussions on how much of a difference 8x vs 16x lanes make for deep learning, but I haven’t seen any numbers showing it makes a huge difference. Your motherboard chipset and M.2 will probably take 4 lanes each so keep that in mind. The back of your motherboard manual has block diagrams showing how your PCI Express lanes are divvied out. (https://dlcdnets.asus.com/pub/ASUS/mb/Socket2066/WS_X299_PRO/Manual/E14486_WS_X299_PRO_Series_UM_V2_WEB.pdf)
- Look at how your graphics cards are cooled and how big/tall they are. Most higher performance consumer grade GPUs take up 2 slots, but some take up more than that. Some motherboards have a larger spacing between their GPU slots. Yours has a pretty common 1 slot in between your GPU slots, so if your top card is larger than 2 slots, you won’t be able to fit another one in slot 2. You do have another lower PCIe x16 slot, but I’m not sure if your motherboard would like having a second GPU there instead of slot 2. I don’t know how much it actually matters, but it’s not what most motherboard manuals recommend. The other thing you need to look out for is cooling on your GPUs. For example, I have 1 water cooled 1080ti and 2 air cooled 1080ti FE cards. The water cooled one has a fan on the bottom but I’ve never actually seen it run. The 1080ti FE cards have a fan at the bottom right which will be running at high speed when you are performing training. I originally had my 1080ti FE cards right next to each other and the top one was getting VERY hot because it didn’t have enough space by the fan to get good airflow. It may work for some people having 2 cards back to back like that, but my card was hitting over 90C and I believe it was throttling. I now have a 1050 card between my 1080ti FE cards. It is a short card and doesn’t block the fan on the 1080ti card that is above it. I only use my 1050 for driving my monitor so I don’t chew up valuable memory on my primary 1080ti cards that I use for DL. The 1050 never gets taxed so it doesn’t matter that its fan port is blocked.
- Before moving my cards to their current configuration, I had a 1080ti that sat over my M.2 slot. I was getting high temperature warnings on the M.2 drive. Your motherboard has an M.2 heatsink so it may not be a problem for you. I moved my 1050 into a slot where it wouldn’t cover my M.2 drive and the warning went away. Depending on which M.2 slot you use, it will be covered if you have a GPU in slot 2.
- Is your power supply sufficient to handle 2 cards, and does it have enough PCIe power cables to power 2 GPUs? I have a 1,000W PSU because I built my rig during the Crypto Boom and it was the largest one that I could get without waiting for a backorder, which was quoted as weeks/months lead time. I wanted at least 1,200W but I couldn’t get one. I think my PSU is only borderline large enough for my build, so this is something you should check.
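The PCIe lane arithmetic from the first point in the list above can be written out as a quick check. This is only a sketch of the post’s rough accounting (8 or 16 lanes per GPU, plus roughly 4 lanes each for the chipset and an M.2 drive); the function names are made up for illustration, and the block diagram in your motherboard manual is the authority on how lanes are really allocated.

```python
def lanes_needed(num_gpus, lanes_per_gpu, chipset_lanes=4, m2_lanes=4):
    """Rough lane budget: GPUs plus the ~4 lanes each assumed for chipset and M.2."""
    return num_gpus * lanes_per_gpu + chipset_lanes + m2_lanes


def fits_on_cpu(cpu_lanes, num_gpus, lanes_per_gpu):
    """True if the rough budget fits within the CPU's 'Max # of PCI Express Lanes'."""
    return lanes_needed(num_gpus, lanes_per_gpu) <= cpu_lanes


if __name__ == "__main__":
    for width in (8, 16):
        need = lanes_needed(2, width)
        verdict = "fits" if fits_on_cpu(28, 2, width) else "does not fit"
        print(f"2 GPUs at x{width}: {need} lanes, {verdict} on a 28-lane CPU")
```

Under these assumptions, two cards at x8 need 24 lanes and fit on a 28-lane CPU, while two cards at x16 need 40 lanes, which is where the "at least 40 lanes" figure comes from.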
Something to note - My water cooled GPU in slot 1 blocks some of the ports on my motherboard that are right above it. This makes it so I can’t have all of the jumpers in my case connected so not all of the ports on my case work. This was annoying but thankfully not a show stopper for me. I also have to remove my GPU in slot 1 if I want to add/remove RAM because the RAM clips won’t open when the GPU is installed. This is also annoying but not a show stopper.
I loved building my DL rig and love having it. I spent a lot of time researching everything to try and make sure it would work when all of the parts arrived. I started off using AWS P2 instances; having my own rig with 1080tis is substantially faster, and I no longer have $100-300/mo AWS bills.
Hopefully you and others find this information useful!
Firstly, this is a wonderful response - thank you so much for taking the time to reply in this manner, it’s really helpful. I’m not sure I can do your post justice given you’ve mentioned many things that at the moment are way over my head, so perhaps I can just ask you a few follow-up questions before diving into any more details.
What was your experience before building your own machines and do you think it’s reasonable for someone who has never done this to do on their own by self-teaching? I do not know a lot of the specifics of the things you’ve mentioned though I luckily have an old PC I could experiment on taking apart and re-building as practice but clearly the big risk is that I mess something up with my current set-up, or worse, break something. I’d prefer as realistic/honest an appraisal as you can give.
Quick point on temperatures: this is basically the reason I’m asking. I want to add a second GPU, but my current GPU already hits 84C (the max), so I think I’m going to need to upgrade my case and cooling system. It was an oversight on my part: the PC is only 6 months old and I (naively) didn’t give this part much thought.
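One way to keep an eye on temperatures like these while training is to poll `nvidia-smi`. The sketch below assumes `nvidia-smi` is on the PATH and uses its `--query-gpu` CSV output; the helper names and the 84C default limit (taken from the figure mentioned above) are illustrative choices, not a recommendation from the thread.

```python
import subprocess

# nvidia-smi query that prints one "index, temperature" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu",
    "--format=csv,noheader,nounits",
]


def parse_temps(csv_text):
    """Turn lines such as '0, 84' into a {gpu_index: temperature_c} dict."""
    temps = {}
    for line in csv_text.strip().splitlines():
        index, temp = (field.strip() for field in line.split(","))
        temps[int(index)] = int(temp)
    return temps


def read_gpu_temps():
    """Run nvidia-smi once and return the parsed temperatures."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_temps(result.stdout)


def hot_gpus(temps, limit_c=84):
    """Return the indices of GPUs at or above limit_c, in index order."""
    return [index for index, temp in sorted(temps.items()) if temp >= limit_c]
```

Calling `hot_gpus(read_gpu_temps())` every few seconds during a training run would show whether a card sits at its limit the whole time or only spikes occasionally.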
Regarding my spec, it’s from pcspecialist and these are perhaps the relevant bits:
Case CORSAIR CARBIDE SERIES™ 200R COMPACT GAMING CASE
Processor (CPU) Intel Core™ i7 Eight Core Processor i7-7820X (3.6GHz) 11MB Cache
Motherboard ASUS WS X299 PRO: ATX, USB 3.1, SATA 6 GB/s
Memory (RAM) 64GB Corsair VENGEANCE DDR4 3000MHz (4 x 16GB)
Graphics Card 11GB NVIDIA GEFORCE GTX 1080 Ti
Power Supply CORSAIR 1000W RMx SERIES™ MODULAR 80 PLUS® GOLD, ULTRA QUIET
Processor Cooling INTEL SOCKET 2011/2066 STANDARD CPU COOLER
Thermal Paste STANDARD THERMAL PASTE FOR SUFFICIENT COOLING
The case is the main limitation I think as it’s quite small which I think is causing the temperature issues. The CPU appears to have 28 PCI express lanes and I did get 1000W power supply with one eye on another GPU at some point. When I click on pcspecialist to upgrade my order (they sadly don’t do case upgrades) and go to add a 2080 ti I don’t get a warning for the power supply compatibility with a 1000W (it estimates I’d need 886W including a 20% buffer). Though it does say “please note that you cannot enable Nvidia® SLi or AMD® Crossfire™” if you have two different GPUs - not sure what this means.
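The kind of power estimate the configurator reports (a total plus a 20% buffer) can be approximated by hand. The wattages in the sketch below are assumptions: approximate nominal TDP figures for the CPU and the two cards, plus a guessed allowance for everything else. They are not measurements of this build and not the vendor’s formula.

```python
def psu_requirement(component_watts, buffer=0.20):
    """Sum the component wattages and add a safety buffer, rounded to whole watts."""
    return round(sum(component_watts.values()) * (1 + buffer))


# Assumed, approximate nominal figures (not measured on this machine).
ASSUMED_WATTS = {
    "i7-7820X": 140,
    "GTX 1080 Ti": 250,
    "RTX 2080 Ti": 250,
    "rest of system (guess)": 100,
}

if __name__ == "__main__":
    need = psu_requirement(ASSUMED_WATTS)
    print(f"estimated requirement: {need} W against a 1000 W supply")
```

With these assumed numbers the estimate lands in the same ballpark as the configurator’s figure and under 1,000W, but real draw under training load can differ, so treat it as a sanity check only.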
So given pcspecialist don’t do case upgrades, I need to either do it myself or find a PC shop that I’m confident knows enough about my set-up and cooling to carry out the upgrade - ideally without costing the earth.
Would love to hear your thoughts.
I think what you’re wanting is:
- Solve cooling problem / transfer the “guts” of your current computer to a new case to fix cooling
- Add a second GPU
- Can you do this yourself
Let me know if this is not the case.
Your current case certainly isn’t the best one I’ve seen for airflow, but I’d be a little surprised if you couldn’t get it working to acceptable levels. How many case fans do you have and which way is each of them blowing? Linus Tech Tips on YouTube has several videos on case fan testing. I would start with watching those. I would have expected a PC builder to set this up right but maybe they didn’t. Are there more case fan mounting slots where you could add more fans? I couldn’t find what was included by default on your current case.
As for mounting multiple GPUs, your motherboard can handle multiple GPUs. Ideally you want space by where your GPU fans are to allow for proper airflow. I have seen plenty of builds where they have blocked the GPU fans, but when I did that with my GPUs, I was not happy with the temperature results. Which 1080ti do you have? There are many variants from different manufacturers. I need more information to help answer this.
As for the question: Can you do this upgrade yourself? There are a ton of great online resources about building your own machine. YouTube would be a good resource. I personally like Linus Tech Tips on YouTube and he has several building a PC guides. I believe you already have your PC built, so it should be much easier to see how it goes together as you’ll have to take it apart first. I found it pretty straightforward to build my own PC after watching YouTube videos. If you’re not comfortable with doing it yourself then I would expect there is probably a local PC shop that could do the work for you at a reasonable cost. A lot of PC gamers build their own PC. If you have a gamer friend that has built their own computer you could ask them to help you out. I probably dedicated about 20-30 hours of my time to building my box, from research on what parts to buy, to learning how to do the build, to actually building it and getting all of the software set up. Building a PC is not something you’ll be able to do in an afternoon if you’ve never done it before. I started off with a basic understanding of what each component’s function was and how they worked together. My total experience with building PCs prior to building my own rig for fast.ai was watching a friend build one over 10 years ago. I believe it is certainly something you can do yourself as long as you are committed to spending the time learning how to do it properly.
A few notes if you decide on doing this yourself:
- Pay attention to cable management. It does not need to be perfect by any means but if you put some thought and effort into it, it will help with cooling and maintainability.
- Pay attention to mounting your CPU to your motherboard and your cooler to your CPU if you end up having to take that apart. I don’t think you should have to take that apart to get your motherboard out of your case, but if you do, that is probably the most delicate thing you’ll have to do. Once you’ve watched enough guides on how to do it, it’s pretty straightforward. If you don’t do it right and don’t handle the parts correctly it wouldn’t be hard to permanently damage them.
- Sometimes you have to push a lot harder than you’d expect to get connectors, RAM and video cards plugged in. If it doesn’t go in easily, triple check that you’re doing it right and then just push harder. Most things are designed so they can only be installed one way. For example, there is a notch in your RAM that won’t let you install it backwards. That’s what I’m talking about when I say “triple check” you’re doing it right.
- Note the order in which the guides tell you to install things. This will make your life easier.
- Read the instructions. You probably don’t have the instructions readily available as your PC was built by someone else, but you should be able to find them online pretty easily from the manufacturer with a little bit of Googling. Some people prefer to jump into things without reading instructions, but I would not recommend that when building a computer for the first time.
Don’t worry about the fact that your GPUs are different and are not SLI compatible. This is only needed for gaming. If you had 2x2080tis you could set up NVLink between them, which I read helps with training on multiple GPUs simultaneously, but you will be just fine without that.
Here are some helpful links that I would suggest you read:
Hopefully that helps!
My organization is planning to build a deep learning rig, as it’s more cost-efficient than the prebuilt machines already on the market, like 1 and 2. I have suggested Gold/Silver/Bronze buckets to them. After looking at the comments, I am thinking of adding a 2080ti to my Gold plan. I’m just a little skeptical about compatibility between the components. @init_27, @antorsae, @matdmiller, I would like to hear all of your views.
If anyone can post the PCPartPicker link of a multi-GPU setup that they have built, it would be helpful. I have a budget of $5,000-$6,000.
This is the PCPartPicker link: https://pcpartpicker.com/list/fJkTtg
Some comments on your lists:
- Jeremy has suggested using Intel CPUs because of deep learning library optimizations for Intel CPUs.
- A 1,600W power supply is overkill for 2 GPUs. 1,000W is enough.
- I’ve talked a lot about GPU airflow in other posts and your build does not address the issue of blocking the fans. Please read through those posts. FE cards get a lot hotter if the fans are blocked.
- I would get a full tower case as it will be easier to work in and probably no more expensive. Make sure it comes with plenty of fans, or at least has room for you to add more if it doesn’t come with many.
- Make sure if you are using water cooling / radiators that your case can fit them.
Here is a link to my build: https://pcpartpicker.com/list/N6mM6s
This is what I would do differently:
- I would have gone for an i9 CPU and a corresponding i9 compatible motherboard, as I am bottlenecked at times with my current CPU when working with image data on multiple GPUs.
- The case I picked functionally works well but it is gigantic and I’m sure there are other more compact cases that would be better. It works well, it just takes up a lot of room.
- 3 GPUs instead of 4. With the exception of some Kaggle competitions, I haven’t run into many cases where I use my 3rd 1080ti. If you are going to use this with a monitor I would get a low powered GPU to drive it. If you are going to use this as a server, then it’s not necessary.
- Bigger NVMe drive. I have hard drives for my cold storage, but it takes some maintenance to move data around and is something else for me to do. If I had a bigger NVMe drive I wouldn’t do this as much. If you aren’t working with big image datasets then your drive size will likely not matter nearly as much.
- Only 1 HDD. I was originally running my machine in Unraid and I decided to move away from that and I don’t really need the second HDD now.
- I would definitely get a 2080ti, but it was not available when I built my machine.
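If you do add a low powered card just for the monitor, as suggested in the list above, one way to keep training jobs off it is the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below is an illustration with a made-up helper name; it also sets `CUDA_DEVICE_ORDER=PCI_BUS_ID` so that CUDA’s device numbering should follow the same order `nvidia-smi` shows.

```python
import os


def training_env(all_gpu_indices, display_gpu_index, base_env=None):
    """Build an environment that hides the display GPU from CUDA programs.

    all_gpu_indices: GPU indices as listed by nvidia-smi.
    display_gpu_index: the card that only drives the monitor.
    base_env: environment to copy (defaults to the current process environment).
    """
    env = dict(os.environ if base_env is None else base_env)
    visible = [str(i) for i in all_gpu_indices if i != display_gpu_index]
    # Number devices by PCI bus position rather than CUDA's default ordering.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    env["CUDA_VISIBLE_DEVICES"] = ",".join(visible)
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching a training script, or the same two variables can simply be exported in the shell before starting training.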
Can you point me to where Jeremy recommended Intel CPUs? This may be the case when training on CPU, but if you’re using a GPU for training I see it as a moot point.
From personal experience I can vouch for the Threadripper (I have the 16-core/32-thread one). It becomes very handy when doing a lot of on-the-fly augmentations; add to that 64 PCIe lanes and a 128 GB RAM maximum.
If you’re not going the Threadripper route, many Intel motherboards, as well as AMD ones (provided you have a processor whose model ends in -G), support on-board graphics, in which case you can use the on-board graphics for display. Personally I run mine with 2 x 2080 Tis, and when I’m not using the desktop I run
sudo service lightdm stop
which kills the graphical desktop and frees all GPU memory for deep learning. Even if you don’t do this, I can still watch 4K movies on that computer while it’s training on both GPUs.
Thanks. My only comment is that if your workload is multi-thread friendly, you’d likely find that more cores/threads provide much bigger gains than library optimizations would (for DL or scientific programming I’d go with more cores rather than X% faster single-core performance).
We’ve often found CPU performance to be the bottleneck on model training time, due to data augmentation and JPEG decoding. With tensor cores this is quite a big issue.
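A back-of-the-envelope way to see when the CPU becomes the bottleneck is to compare how many images per second the CPU workers can decode and augment with how many the GPU can consume. Every number in this sketch is hypothetical; you would measure the per-image preprocessing time and the GPU throughput on your own machine.

```python
def cpu_images_per_sec(workers, ms_per_image):
    """Images per second the CPU can prepare, assuming perfectly parallel workers."""
    return workers * 1000.0 / ms_per_image


def is_cpu_bound(workers, ms_per_image, gpu_images_per_sec):
    """True if the CPU cannot prepare images as fast as the GPU consumes them."""
    return cpu_images_per_sec(workers, ms_per_image) < gpu_images_per_sec


if __name__ == "__main__":
    # Hypothetical: 10 ms to decode and augment one JPEG, GPU consumes 1500 images/s.
    for workers in (8, 32):
        rate = cpu_images_per_sec(workers, 10)
        bound = is_cpu_bound(workers, 10, 1500)
        print(f"{workers} workers: {rate:.0f} images/s, CPU-bound: {bound}")
```

Under these made-up numbers, 8 workers cannot keep up with the GPU while 32 can, which is the intuition behind preferring more cores for augmentation-heavy workloads. Real scaling is rarely perfectly linear, so this gives an upper bound on what extra cores buy you.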