Opinionated guide to deep learning hardware



tldr: The best hardware for deep learning is the one you already have. The second best is the most powerful GPU you can afford along with standard components to support it.

Why this guide

Which hardware to get is a topic of interest to many, and plenty of voices contribute to the discussion.

The problem is that the conversation can be quite misleading to those who are new to the field.

You are more likely to come across a hardware opinion from someone who likes to discuss such things online than from a seasoned ML pro. While the advice might be technically sound, it might not be useful when building one’s first deep learning rig.

Here are a bunch of thoughts on this matter that I think can be helpful.

How much does it all cost?

The most valuable resource you have is time.

If you are just starting out at something, you will get the best bang for your buck… practicing said thing. That might mean following howtos, writing code and listening to an occasional lecture.

But every minute you spend researching hardware is a minute you could have spent doing something else.

Instead of picking a slightly better stick of RAM you could, for instance, have learned a new debugging trick. Both might take 30 minutes, but the latter will be far more useful.

But who am I kidding with 30 minutes. For many, the search for a perfect parts combination will resemble a quest for the Holy Grail, taking many, many hours over multiple weeks.

That is time irrevocably lost.

Ok, ok but I can only afford a previous gen GPU, what now?

The advice still stands: get the biggest GPU you can reasonably afford. If you are on a budget and really want to give DL a shot, then you have two options. One is figuring out where to get AWS or GCP credits, which can be done with a bit of googling, although cloud instances have a steeper learning curve. The other is getting your own desktop PC. In the longer run, your own hardware will most likely give you a much better ROI than the cloud route, even if you factor in available cloud credits.
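If you want to sanity-check the ROI claim for your own situation, a back-of-the-envelope break-even calculation is enough. The sketch below is only an illustration: the function name and every number in the usage example are hypothetical placeholders, not real prices, and it assumes a flat hourly cloud rate and ignores resale value.

```python
def break_even_hours(hardware_cost: float,
                     cloud_rate_per_hour: float,
                     cloud_credits: float = 0.0,
                     electricity_per_hour: float = 0.0) -> float:
    """Hours of GPU use after which owning a PC is cheaper than renting.

    Owning costs:  hardware_cost + electricity_per_hour * hours
    Renting costs: cloud_rate_per_hour * hours - cloud_credits
    Setting the two equal and solving for hours gives the formula below.
    """
    hourly_saving = cloud_rate_per_hour - electricity_per_hour
    if hourly_saving <= 0:
        raise ValueError("renting is not more expensive per hour; no break-even point")
    # Free credits make the cloud cheaper, so they push the break-even point later.
    return (hardware_cost + cloud_credits) / hourly_saving


if __name__ == "__main__":
    # All values are made-up placeholders; plug in your own quotes.
    hours = break_even_hours(hardware_cost=1500.0,
                             cloud_rate_per_hour=1.5,
                             cloud_credits=500.0,
                             electricity_per_hour=0.25)
    print(f"Break-even after about {hours:.0f} GPU hours")
```

Divide the result by the number of hours you expect to train per week to see roughly how long the box takes to pay for itself. Do not spend more than a few minutes on this.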

Does it even matter what hardware I get?

When coming up with a build for your first DL PC, you will lack the experience to evaluate the options available to you. Training models uses hardware differently from most workloads you might have encountered.

The advice of getting the biggest single GPU you can afford will lead you to something close to the optimal rig you could build right now.

The crucial part is to avoid being intimidated by the hardware conversation and to move to the part where you run your own ML experiments.

If you can accomplish these two things, everything else becomes unimportant.

Summary

If you are willing to invest time into DL, get a DL box quickly and put your time where it makes a difference.

If you really, really want to look at charts, there is a single place on the Internet that is worth your time and that is a blog by a researcher, Tim Dettmers.

Addendum

A list of stuff you certainly don’t need:

  • fractionally faster SSDs / NVMe drives (NVMe drives are fun, but an average SSD will do)

  • ECC memory

  • RAM disks

  • 1000+ watt power supplies

  • motherboards and CPUs supporting some specific number of lanes

  • fast RAM

Things you might be happy you got

  • a smaller SSD to store training data and a bigger HDD for archiving