Making your own server

(sravya8) #162

Thank you! :slight_smile: It cost me about $2050. I could probably have saved $200-300 if I had been on a stricter budget.

Sorry, I had a paper parts list which I can’t seem to find now.

(sravya8) #163

Thanks Layla! :slight_smile: Interesting question. It has been less than a month since I got this setup, but I will keep an eye on hikes in the bill and let you know.

(Christina Young) #164

@Pomo I haven’t looked at this thread in a while – I see you set up a 2nd partition with Linux, so do you still need instructions on how to do it with Windows? Let me know, I can post some information. I got it working with Windows 7.

I gave up on Theano though, and use Tensorflow instead. Theano requires some obsolete version of Visual C++ and I wasn’t able to get it to run. With Tensorflow you just need to make sure you use the right dimension ordering.
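For anyone hitting the same thing, the dimension-ordering difference boils down to this (a minimal numpy sketch; in the class notebooks the actual fix was switching Keras to the Tensorflow-style ordering in its config):

```python
import numpy as np

# Theano-style ordering puts channels first: (channels, rows, cols).
# Tensorflow-style puts them last: (rows, cols, channels).
img_th = np.zeros((3, 224, 224))        # e.g. an RGB image, channels first
img_tf = np.moveaxis(img_th, 0, -1)     # convert to channels last

print(img_th.shape)  # (3, 224, 224)
print(img_tf.shape)  # (224, 224, 3)
```

If an array comes in with the wrong ordering, a single `moveaxis`/`transpose` like this usually sorts it out.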

Let me know… if anyone is interested I will post how I did it.

I was thinking about building a new, dedicated deep learning rig with Ubuntu and a GTX 1080 Ti, but I decided to get one of those new Nvidia Jetson TX2’s instead, since I am more interested in embedded applications anyway (being an old embedded C engineer). :smirk: I am looking forward to getting it this week!

In the meantime, I think I will run Turbotax and do my taxes while my CNN is training in the background… :grimacing:

(Malcolm McLean) #165

Hi Christina. I admire your skill and persistence. Thanks very much for your offer to help. However, I bailed out of Windows when I saw that you have to install an obsolete VS and adjust a bunch of settings. What could possibly go wrong?

Just last night, I got my Ubuntu setup fully working, after days of experimentation and googling. Not the least of which was acquiring basic skills that are assumed by certain instructions, like “login to the TTY”, and “install git”. I’ll write up a “Beginner’s Guide to Setting Up Ubuntu…” during the next few days, in hopes of saving the next person time and frustration.

(Christina Young) #166

@Pomo – it really wasn’t all that hard to get it running on Windows 7 natively… as long as you are okay going with Tensorflow and Python 3.5. There are a few corrections I had to make to the class notebooks for this, but nothing major. Of course, I couldn’t do the Theano RNN examples in lessons 6/7 – but I played around with those up on my AWS instance instead.

By “natively”, I mean NO Cygwin… straight out of a DOS prompt! I can run Jupyter Notebook, Python and even Spyder.

If anyone else ever needs to do this, just ping me on this thread @Christina and I will post the instructions.

(Mariya) #167

I’ve secured a Titan X Pascal and plan to get another one for a total of 2 GPUs. Has anyone here built a box around the Pascal and can share their hardware components and configuration steps? I’m not a hardware person or a professional engineer, so I’m preparing for a long, painful slog to get this box built :frowning:

Also - do you generally want multiple GPUs in a single box to be of the same model to simplify configuration, or does it not matter if you have one Pascal and one 1080 Ti?

cc: @davecg @Matthew @jeremy

(David Gutman) #168

Think it’s more of an issue for SLI (which doesn’t help with CUDA anyway). Jealous of your GPUs… Was that through the NVIDIA grant?

(Mariya) #169

Seems like, per your point re: CUDA, SLI is mostly for boosting graphics performance. Do you know if there’s an impact for deep learning applications? Otherwise, it seems overly complicated to go for SLI, since it requires a specialized motherboard and other hardware.

Yes - got two professor friends to apply for the grant. Secured one. Waiting to hear back on the other one, but feeling hopeful :smiley:

(Christopher) #170

For SLI (gaming) the model matters; the cards need to be the same. For machine learning, it does not.

As for the box, you really have 3 options, and it depends on what you want to do:

The first option is to go cheaper with AMD or a smaller processor. (I highly recommend staying away from AMD: the chip is cheaper, but the rest of the components are regular price, so the savings on a high-end ~$2000+ system are only about $100-$150, and Intel is almost always faster.) Going with a smaller CPU will slow down your epochs a bit and bottleneck (limit the peak performance of) the GPU, but the CPU isn’t a huge factor (think 5-20%); the GPU is king. If the server is only going to be used as a headless (no monitor, non-interactive) deep learning box, then this is fine, but I still recommend against it personally. Just as with AMD, the price difference between a mid-level and a top-level CPU isn’t big once you start factoring in $400-$1200 GPUs (think $100-200 tops).

The second option is what I have: I use my desktop heavily for all sorts of things, from AAA gaming to game development, photography and deep learning. So I went with the best Intel I could get for raw single-core performance (Intel 7700K). I only have a 1070, but I am considering two 1080 Tis soon, when MSI comes out with their Gaming X models (faster stock speeds and far better cooling; cooling is important for gaming but absolutely critical for machine learning, and I think this is underestimated because the stock cooling is awful).

The third option is a true server build, which involves using Intel Xeon processors rather than traditional desktop processors. The benefit is that you can break out of the traditional 4-core chips and go to 8 and 10 cores, as well as multiple CPUs, for a total of 40+ cores with hyperthreading. You also get 40 PCI Express lanes, which allows you to run your GPUs without any bottlenecks. Even on a desktop, 1-2 GPUs on newer boards (PCI Express 3.0) won’t be throttled; you only drop down to 4x once you have 3+ GPUs (a single card runs at 16x, two cards at 8x each, and three cards at 8x, 8x and 4x). Each NVMe drive also takes away 4 lanes, so if you run 3 GPUs you will sacrifice the speed of two of them to use it (8x, 4x, 4x, plus 4x for the NVMe).

Basically, the way I see it, I would not opt for option one (a cheaper CPU and components with the budget focused on high-end GPUs). You are already into the thousands; another $150 to get the best CPU isn’t going to make a big difference in cost. If you are using it for interactive work or as your main desktop, that’s even more reason to spend it. The only time I would recommend AMD, or cheaping out on the CPU, is if you are building a low-budget $500-$700 machine, because it will let you get a super cheap box; once you get over $1000, I would recommend just getting the best CPU available.

If you want a dedicated server, plan on going up to 4 GPUs, and want the best “power-house” deep learning box you can get, I recommend the Xeon route. Xeons are better overall but are poor choices for interactive desktops and gaming machines, as the CPUs are slower for single-threaded processes (which most work is); in exchange you get a lot more cores and, more importantly, more than twice the PCI Express lanes. If you do not plan on 3 or more GPUs, this isn’t the best option. If you plan on 1070s, you can probably run 3 of them at effectively full speed without even going Xeon for the 40 lanes, because a 1070 just isn’t fast enough to be bottlenecked heavily even at 4x. If you are going to share the box with other people, or plan to run a lot of jobs in parallel, I would lean towards Xeon as well.

tl;dr Unless you are building the cheapest system you possibly can, don’t cheap out on the CPU by buying AMD or a lower tier; the price difference is tiny and the performance difference significant. If you are building a multi-user server or plan on going 3+ GPUs, go Xeon, unless it is also your interactive desktop, in which case you might want to stick with an Intel 7700K or Intel Extreme.

(Christopher) #171

CUDA (the Nvidia technology used for deep learning) cannot use SLI; SLI is purely for gaming. That said, Nvidia is working on an equivalent of SLI for deep learning.

Just to clarify what SLI is, it is very simple. GPUs plug into the motherboard’s PCI Express bus, those long slots they click into. This is what we refer to when we say 8x, 4x, 20 lanes and so on: how many lanes of the PCI Express bus a card gets.

SLI is Nvidia’s way of allowing two GPUs to work together without communicating over the PCI Express bus; it is simply a small cable that sits on top of the GPUs and connects them directly. This allows higher-speed communication between the two GPUs and keeps the CPU out of the processing. In gaming, the processing can be split between two cards, but in the end there is still one output (the port the monitor is plugged into), so there is always a master card that does the final work.

For deep learning this doesn’t work, but the idea is very powerful and could make huge changes to the deep learning industry if they come out with an “SLI for deep learning” (I believe it is called NVLink, and they are working on it). The reason this is such a big deal is that you could have 4 Titan X cards that look like one giant card to the system; instead of a 10-70% performance gain per card with deep learning, in theory you could get near 100%, like SLI does for gaming. This would be a huge improvement: not only would Tensorflow not need to split the job across multiple GPUs, it would scale nearly linearly with each card you add.

tl;dr SLI bypasses PCI Express to allow 2 or more GPUs to talk directly, but it only works for gaming. Nvidia is coming out with an “SLI for deep learning” (NVLink) that could dramatically improve multi-GPU scaling, to nearly a 100% gain per GPU.

(Michael Guia) #172

Definitely post it!


I’ve got a question for you guys, if you could help me.

I want to follow this course, and I would like to take advantage of the GTX 970 that I already have, and also of the fact that Tensorflow is officially supported on Windows now. Has anyone set the whole thing up for Windows? I mean, can this course be followed without using AWS? In that case, which other things (apart from the CUDA drivers) would I need to set up, and how could I do it?

Thanks in advance!

(Michael Guia) #174

Hi @Estiui and welcome to the community!

It’s an honor to be the one to respond to your very first post :smiley:

I’ll do my best to be open and honest in my feedback! I know the classic Unix vs Windows debate can get hot and heavy sometimes, but let’s keep it civil, ok?

First of all, it seems like you have all the basic ingredients. Where it becomes complicated is in the differences in the Windows OS kernel. Compared to Unix-based systems, it’s not apples to apples under the hood, so YMMV.

If you want to get into the specifics, a quick rundown would look something like:

  • Installing your GPU drivers
  • Installing the CUDA toolkit
  • Configuring the path variables
  • Installing CuDNN (this was tough on Mac last I remembered)
  • Installing Anaconda
  • Changing the conda.conf file to allow pip
  • pip installing tensorflow with this script
pip install --ignore-installed --upgrade
  • then just pip installing whatever’s needed (in my case, a hot cup of chai was needed as well!)
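After working through those steps, a quick way to sanity-check that everything landed on the PATH is something like this (just a sketch; tool names and paths vary by install):

```shell
# Check that each tool from the steps above is reachable from the prompt
# (illustrative only; adjust names for your own setup).
check() { command -v "$1" >/dev/null 2>&1 && echo "$1: found" || echo "$1: MISSING"; }
for tool in nvcc conda pip jupyter python; do
    check "$tool"
done
```

Anything reported MISSING usually means the corresponding path variable from the list above still needs fixing.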

As a bonus, here’s how to remotely configure Jupyter.
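The gist of the remote setup is a few lines in Jupyter’s config file (a sketch of the usual settings; generate the file first with `jupyter notebook --generate-config`):

```python
# Typical settings in ~/.jupyter/jupyter_notebook_config.py
c.NotebookApp.ip = '0.0.0.0'        # listen on all interfaces, not just localhost
c.NotebookApp.open_browser = False  # don't pop a browser on the server itself
c.NotebookApp.port = 8888
# Generate a password hash with:
#   python -c "from notebook.auth import passwd; print(passwd())"
c.NotebookApp.password = 'sha1:...'  # paste the generated hash here
```

With that in place you can browse to the box’s IP on port 8888 from another machine.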

Good luck!


Thank you very much Michael, I’ll try to follow those steps!

(Michael Guia) #176

Fair warning, some users recommend using Ubuntu. Just food for thought…

Perhaps @dradientgescent can clarify this

(Jeremy Howard) #177

Just to clarify: “Pascal” refers to a whole generation of Nvidia GPUs, which includes the 1080 series and the most recent Titan X.

And it’s fine to have different cards, although if they have very different memory amounts it can get confusing.

(Christopher) #178

A 970 will be able to do a lot of part I but will need some adjusting for the CNNs, as it has low memory and some of that memory is crippled.

All of part I works on Windows, but part II does not, as Tensorflow for Python 3.6 isn’t available on Windows yet, and Windows in general causes a lot of headaches with most of Jeremy’s notebooks. Many of them rely on Linux-specific behavior when doing file operations, and some of the libraries are a pain in the butt on Windows.

I highly recommend avoiding Windows; it just adds far too many headaches. You can get away with its little issues in part I, just not part II.

Jupyter is much more responsive on Linux as well; training times are very similar, but Jupyter is laggy on Windows doing things like opening notebooks and processing output.


First of all, thank you for your answer.

As usual, it seems that Windows is the least recommended option for science…! So, let me redo my initial question. If I wanted to avoid Windows, could I make an Ubuntu installation under VirtualBox work with my GPU (the host system being Windows 10)? If that is not possible, could I follow the course without a GPU, in a VirtualBox VM with Ubuntu and the Tensorflow CPU-only installation?

What I want, in the end, is to be able to set up and use my own GPU, as I already have it. My second option would be to know if it is feasible to follow the course with a CPU-only installation, as I’d like to avoid AWS, which would be my last option.

Thanks again for your help!

(Christopher) #180

If you are doing part I, then Windows is usable. It isn’t ideal, as you need to change the notebooks a little when it comes to the operating system commands issued via the % symbol.
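To give a flavor of the kind of adjustment I mean: the notebooks shell out with magics like %mkdir and %cd, which assume Unix, and the portable fix is to use Python’s own os/glob/shutil calls instead (a sketch, with hypothetical paths):

```python
import glob
import os

# Portable stand-ins for the Unix-flavoured notebook magics
# (the paths here are hypothetical examples):
os.makedirs("data/train", exist_ok=True)  # instead of %mkdir -p data/train
os.chdir("data")                          # instead of %cd data
jpgs = glob.glob("*.jpg")                 # instead of %ls *.jpg
# shutil.move(src, dst) replaces %mv, shutil.copy replaces %cp
os.chdir("..")                            # back to where we started, instead of %cd ..
```

These work identically on Windows and Linux, so the same notebook runs on both.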

Part II is not possible on Windows, due to even more file system issues and missing core libraries like Tensorflow and PyTorch for Python 3.6.

You can use VirtualBox, but using Ubuntu bash for Windows (WSL) would be better, although it has some limitations and you can’t use the GPU.

Windows cannot pass the GPU through to Linux, but you can pass a GPU from Linux to Windows.

The best option, if you don’t build a dedicated box, is to dual boot.

CPU-only is going to be really slow, costing you nearly a hundred hours or more of wasted time.

You can look at Floyd: you get enough free hours to finish the course, but it is extremely difficult and frustrating to use. AWS is a better choice; it will cost a lot more, though that can be mitigated by using spot instances.


Hmmm, I see… well, then I’m going to try to install Ubuntu alongside Windows; I should have done it years ago. Thanks for all the help!