Making your own server

Thank you! :slight_smile: It cost me about $2050. I could probably have saved $200-300 if I had been on a stricter budget.

Sorry, I had a paper parts list which I can't seem to find now.

Thanks Layla! :slight_smile: Interesting question. It has been less than a month since I set this up, but I will keep an eye on any hikes in the bill and let you know.


@Pomo I haven't looked at this thread in a while – I see you set up a 2nd partition with Linux, so do you still need instructions on how to do it with Windows? Let me know, I can post some information. I got it working with Windows 7.

I gave up on Theano though, and use Tensorflow instead. Theano requires an obsolete version of Visual C++ and I wasn't able to get it to run. With Tensorflow you just need to make sure you use the right dimension ordering.
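For anyone hitting the same thing, the switch is roughly this, assuming the Keras 1.x backend API from around the time of the course (you can also set it in ~/.keras/keras.json):

```python
# Rough sketch of forcing TensorFlow-style dimension ordering in Keras 1.x.
from keras import backend as K

# Theano orders image tensors as (channels, rows, cols);
# TensorFlow expects (rows, cols, channels).
K.set_image_dim_ordering('tf')
print(K.image_dim_ordering())  # should print 'tf'

# Convolutional input shapes are then channels-last, e.g.
# model.add(Convolution2D(32, 3, 3, input_shape=(224, 224, 3)))
```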

Let me know... if anyone is interested I will post how I did it.

I was thinking about building a new, dedicated deep learning rig with Ubuntu and a GTX 1080 Ti, but I decided to get one of those new Nvidia Jetson TX2's instead, since I am more interested in embedded applications anyway (being an old embedded C engineer). :smirk: I am looking forward to getting it this week!

In the meantime, I think I will run TurboTax and do my taxes while my CNN is training in the background... :grimacing:

Hi Christina. I admire your skill and persistence. Thanks very much for your offer to help. However, I bailed out of Windows when I saw that you have to install an obsolete VS and adjust a bunch of settings. What could possibly go wrong?

Just last night, I got my Ubuntu setup fully working, after days of experimentation and googling. Not the least of it was acquiring basic skills that certain instructions assume, like "log in to the TTY" and "install git". I'll write up a "Beginner's Guide to Setting Up Ubuntu..." during the next few days, in hopes of saving the next person time and frustration.


@Pomo – it really wasn't all that hard to get it running on Windows 7 natively... as long as you are okay going with Tensorflow and Python 3.5. There are a few corrections I had to make to utils.py and the class notebooks, but nothing major. Of course, I couldn't do the Theano RNN examples in lessons 6/7 – but I played around with those on my AWS instance instead.

When I say "natively", I mean NO Cygwin... straight out of a DOS prompt! I can run Jupyter Notebook, Python and even Spyder.

If anyone else ever needs to do this, just ping me on this thread @Christina and I will post the instructions.


I've secured a Titan X Pascal and plan to get another one for a total of 2 GPUs. Has anyone built a box around the Pascal and can share their hardware components and configuration steps? I am not a hardware person or a professional engineer, so I am preparing for a long, painful slog to get this box built :frowning:

Also - do you generally want multiple GPUs in a single box to be of the same model to simplify configuration, or does it not matter if you have one Pascal and one 1080 Ti?

cc: @davecg @Matthew @jeremy


Think it's more of an issue for SLI (which doesn't help with CUDA anyway). Jealous of your GPUs... Was that through the NVIDIA grant?

Per your point re: CUDA, it seems that SLI is mostly for boosting graphical performance. Do you know if there's any impact for deep learning applications? Otherwise, it seems overly complicated to go for SLI, since it requires a specialized motherboard and other hardware: http://www.geforce.com/hardware/technology/sli

Yes - got two professor friends to apply for the grant. Secured one. Waiting to hear back on the other one, but feeling hopeful :smiley:

For SLI (gaming) the models matter: they need to be the same. For machine learning, it does not matter.

As for the box, you really have 3 options, and it depends on what you want to do:

The first option is to go cheaper with AMD or a smaller processor. I highly recommend steering away from AMD: it is cheaper for the chip, but the rest of the components are regular price, so the savings on a high-end ~$2000+ system are only about $100-$150, and Intel is almost always faster. Going with a smaller CPU will slow down your epochs a bit and bottleneck (limit the peak performance of) the GPU, but the CPU isn't a huge factor (think 5-20%); the GPU is king. If the server is only going to be used as a headless (no monitor, not interactive) deep learning box, then this is fine, but I still recommend against it personally. Just like with AMD, the difference between a mid-level and a top-level CPU isn't big once you start factoring in $400-$1200 GPUs (think $100-200 tops).

The second option is what I have. I use my desktop heavily for all sorts of things, from AAA gaming and game development to photography and deep learning, so I went with the best Intel I could get for raw single-core performance (Intel 7700K). I only have a 1070, but I am considering two 1080 Tis as soon as MSI comes out with their Gaming X models (faster stock speeds and far better cooling; cooling is important for gaming but absolutely critical for machine learning, and I think this is underestimated, as the stock cooling is awful).

The third option is a true server build, which means using Intel Xeon processors rather than the traditional desktop processors. The benefit is that you can break out of the traditional 4-core chips and go to 8 and 10 cores, as well as multiple CPUs for a total of 40+ cores with hyperthreading. You also get 40 PCI Express lanes, which lets you run your GPUs without bottlenecks. Even on a desktop, 1-2 GPUs with newer boards (PCI Express 3) won't be throttled until you drop down to 4x, which typically happens with 3+ GPUs (a single card runs at 16x, two cards at 8x each, and three cards at 8x, 8x and 4x). Each NVMe drive also takes away 4 lanes, so if you run 3 GPUs you will sacrifice the speed of two of them to use it (8x, 4x, 4x, and 4x for the NVMe).

Basically, the way I see it, I would not opt for option one (a cheaper CPU and components with the focus on high-end GPUs). I'd just get the best CPU: you're already in the thousands, and another $150 for the best CPU isn't going to make a big difference in cost. If you are using it for interactive work or as your main desktop, that's even more reason to do this. The only time I would recommend AMD or cheaping out on the CPU is if you are going for a low-budget $500-$700 machine, because it will allow you to get a super cheap box; but once you get over $1000 I would recommend just getting the best CPU available.

If you want a dedicated server, plan on going up to 4 GPUs, and want the best setup you can get (a "power-house" deep learning box), I recommend the Xeon route. Xeons are better overall but are poor choices for interactive desktops and gaming machines: the CPUs are slower for single-threaded processes (which most work is) but have a lot more cores and, more importantly, twice the PCI Express lanes. If you do not plan on 3 or more GPUs, this isn't the best option. If you plan on 1070s, you can probably run 3 of them at full speed without even going Xeon for the 40 lanes, because a 1070 just isn't fast enough to be heavily bottlenecked even at 4x. If you are going to share with other people, or plan to run a lot of jobs in parallel, I would lean towards Xeon as well.

tl;dr Unless you are building the cheapest system you can possibly build, don't cheap out on the CPU by buying AMD or a lower tier; the price difference is tiny and the performance difference significant. If you are building a multi-user server or plan on going 3+ GPUs, go Xeon, unless it is your interactive desktop, in which case you might want to stick with an Intel 7700K or Intel Extreme.


CUDA (the nVidia technology used for deep learning) cannot use SLI; SLI is purely for gaming. That being said, nVidia is working on an equivalent of SLI for deep learning.

Just to clarify what SLI is: it is very simple. GPUs plug into the motherboard's PCI Express bus, those long slots they click into. This is what we refer to when we say 8x, 20 lanes, 4x and so on: how many lanes of the PCI Express bus a device gets.

SLI is nVidia's way of allowing two GPUs to work together without communicating over the PCI Express bus; it is simply a small cable that sits on top of the GPUs and connects them directly. This allows higher-speed communication between the two GPUs and keeps the CPU out of the processing. In gaming, the processing can be split between two cards, but there is still only one output (the port the monitor is plugged into), so in the end there is always a master card that does the final work.

For deep learning, this doesn't work, but the idea is very powerful and could make huge changes to the deep learning industry if they come out with an SLI for deep learning (NVLink, I believe). The reason this is such a big deal is that you could have 4 Titan X cards that look like one giant card to the system, and instead of a 10-70% performance gain per card with deep learning, in theory you could get near 100%, like SLI does for gaming. This would be a huge improvement: not only would Tensorflow not need to split the job across multiple GPUs, it would scale nearly linearly with each card you add.
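In the meantime, multi-GPU work in Tensorflow means pinning ops to devices yourself and paying the PCI Express cost to combine results, which is part of why scaling is sub-linear today. A rough TF 1.x sketch (my own illustration, not anything official):

```python
# Rough sketch of explicit multi-GPU placement in TensorFlow 1.x. Each
# "tower" is pinned to one GPU; combining the results goes through the
# CPU and the PCI Express bus, which is one reason scaling is sub-linear.
import tensorflow as tf

towers = []
for i in range(2):  # assumes two visible GPUs
    with tf.device('/gpu:%d' % i):
        a = tf.random_normal([1024, 1024])
        b = tf.random_normal([1024, 1024])
        towers.append(tf.matmul(a, b))

with tf.device('/cpu:0'):
    total = tf.add_n(towers)  # results cross PCIe to be combined

config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
with tf.Session(config=config) as sess:
    sess.run(total)
```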

tl;dr SLI bypasses PCI Express to allow 2 or more GPUs to talk directly, but it only works for gaming. nVidia is coming out with an "SLI for deep learning" that could dramatically improve multi-GPU scaling to nearly a 100% gain per GPU.


Definitely post it!

I've got a question for you guys, if you could help me.

I want to follow this course, and I would like to take advantage of the GTX 970 that I already have, and also of the fact that Tensorflow is now officially supported on Windows. Has anyone set the whole thing up for Windows? I mean, can this course be followed without using AWS? In that case, which other things (apart from the CUDA drivers) would I need to set up, and how could I do it?

Thanks in advance!

Hi @Estiui and welcome to the fast.ai community!

It's an honor to be the one to respond to your very first post :smiley:

I'll do my best to be open and honest in my feedback! I know the classic Unix vs Windows debate can get hot and heavy sometimes, but let's keep it civil, OK?

First of all, it seems like you have all the basic ingredients. Where it gets complicated is in the differences in the Windows OS kernel. Compared to Unix-based systems, it's not all apples to apples under the hood, so YMMV.

If you want to get into the specifics, a quick rundown would look something like:

  • Installing your GPU drivers
  • Installing the CUDA toolkit
  • Configuring the path variables
  • Installing CuDNN (this was tough on Mac, as I remember)
  • Installing Anaconda
  • Changing the conda.conf file to allow pip
  • pip installing tensorflow with this script
pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/windows/gpu/tensorflow_gpu-1.0.1-cp35-cp35m-win_amd64.whl
  • then just pip installing whatever's needed (in my case, a hot cup of chai was needed as well!); a quick sanity check like the one below confirms the GPU is visible
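The sanity check I use (assuming the TF 1.0 GPU wheel above installed cleanly) is just to ask Tensorflow what devices it can see:

```python
# Quick check that TensorFlow on Windows can actually see the GPU.
import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.__version__)
print([d.name for d in device_lib.list_local_devices()])
# Expect something like ['/cpu:0', '/gpu:0'] if CUDA and cuDNN are set up right.
```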

As a bonus, here's how to remotely configure Jupyter.
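The gist of it, as a minimal sketch using the standard notebook config options (pick your own port and set a password):

```python
# Contents of ~/.jupyter/jupyter_notebook_config.py after running
#   jupyter notebook --generate-config
#   jupyter notebook password
c = get_config()  # provided by Jupyter when it loads this file

c.NotebookApp.ip = '0.0.0.0'        # listen on all interfaces, not just localhost
c.NotebookApp.open_browser = False  # don't try to pop a browser on the server
c.NotebookApp.port = 8888           # pick a port and open it in your firewall
```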

Good luck!


Thank you very much Michael, I'll try to follow those steps!

Fair warning, some users recommend using Ubuntu. Just food for thought...

Perhaps @dradientgescent can clarify this.

Just to clarify: "Pascal" refers to a whole generation of Nvidia GPUs, which includes the 1080 series and the most recent Titan X.

And it's fine to have different cards, although if they have very different memory amounts it can get confusing.

The 970 will be able to do a lot of Part I, but it will need some adjusting for the CNN work, as it has low memory and some of that memory is crippled (only 3.5 GB of its 4 GB runs at full speed).
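The usual workaround is just a smaller batch size; something like this, where the helper names are the course-style Keras 1.x ones (adjust to whatever you are actually running):

```python
# Illustrative only: shrink the batch size when a 4 GB (effectively ~3.5 GB)
# GTX 970 runs out of memory on the lesson CNNs.
batch_size = 16  # instead of the 64 used on larger cards; tune as needed

# With the course-style Keras 1.x helpers (names assumed from the lessons):
# batches = vgg.get_batches(path + 'train', batch_size=batch_size)
# val_batches = vgg.get_batches(path + 'valid', batch_size=batch_size * 2)
# vgg.fit(batches, val_batches, nb_epoch=1)
```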

All of Part I works on Windows, but Part II does not, as Tensorflow for Python 3.6 isn't available for Windows yet, and Windows in general causes a lot of headaches with most of Jeremy's notebooks. Many of them rely on Linux behavior when doing file operations, and some of the libraries are a pain in the butt on Windows.

I highly recommend avoiding Windows; it just adds far too many headaches. You can get away with its little issues in Part I, just not Part II.

Jupyter is much more responsive on Linux as well; training times are very similar, but Jupyter is laggy on Windows when doing things like opening notebooks and processing output.

First of all, thank you for your answer.

As usual, it seems that Windows is the least recommended option for science...! So, let me redo my initial question. If I wanted to avoid Windows, could I make an Ubuntu installation under VirtualBox work with my GPU (the host system being Windows 10)? If that is not possible, could I follow the course without a GPU, in a VirtualBox VM with Ubuntu and the CPU-only Tensorflow installation?

What I want, in the end, is to be able to set up and use my own GPU, as I already have it. My second option would be to know whether it is feasible to follow the course with a CPU-only installation, as I'd like to avoid AWS, which would be my last option.

Thanks again for your help!

If you are doing Part I, then Windows is usable. It isn't ideal, as you need to change the notebooks a little when it comes to the operating system commands run via the % symbol.
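For example, cells that shell out can usually be swapped for portable Python along these lines (the paths are just examples, not the notebooks' exact ones):

```python
# Portable replacements for notebook shell magics that break on Windows.
import os
import shutil
from glob import glob

# instead of %mkdir -p data/sample/train
os.makedirs(os.path.join('data', 'sample', 'train'), exist_ok=True)

# instead of !cp data/train/cat.1*.jpg data/sample/train/
for f in glob(os.path.join('data', 'train', 'cat.1*.jpg')):
    shutil.copy(f, os.path.join('data', 'sample', 'train'))
```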

Part II is not possible on Windows: there are even more file-system issues, and core libraries like Tensorflow and PyTorch are missing for Python 3.6.

You can use VirtualBox, but using Ubuntu bash for Windows would be better, although it has some limitations and you can't use the GPU.

Windows cannot pass the GPU through to a Linux guest, but you can pass a GPU from a Linux host to a Windows guest.

The best option, if you do not build a dedicated box, is to dual boot.

CPU-only is going to be really slow, costing you nearly a hundred hours or more of wasted time.

You can look at Floyd: you get enough free hours to finish the course, but it is extremely difficult and frustrating to use. AWS is a better choice but will cost a lot more, which can be mitigated by using spot instances.
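If you do go the spot route, the request itself can be scripted; here's a rough boto3 sketch (the AMI ID, key pair, region and bid price are placeholders, not recommendations):

```python
# Illustrative boto3 sketch of an EC2 spot request for a GPU instance.
import boto3

ec2 = boto3.client('ec2', region_name='us-west-2')
response = ec2.request_spot_instances(
    SpotPrice='0.25',                   # your maximum bid, USD per hour
    InstanceCount=1,
    LaunchSpecification={
        'ImageId': 'ami-xxxxxxxx',      # placeholder: your deep learning AMI
        'InstanceType': 'p2.xlarge',
        'KeyName': 'my-key-pair',       # placeholder key pair name
    },
)
print(response['SpotInstanceRequests'][0]['SpotInstanceRequestId'])
```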


Hmmm, I see... well, then I'm going to try to install Ubuntu alongside Windows; I should have done it years ago. Thanks for all the help!