It's Alive! My Deep Learning Rig for Part 1

(Phil Weslow) #1

I started Part One of the course roughly a month ago. My initial plan was to use Paperspace as my practice environment. However, my request for an Ubuntu instance never seems to have been approved or answered. For a couple of weeks I used Crestle to get up and running. I greatly appreciate the “push button” simplicity that Crestle provides for launching an environment.

With continued usage, I began to experience some frustration with training times, and a few times the system hung while trying to spin up an instance. I tried signing up for an Amazon P2 Large instance, but the response I received from Amazon made me feel like I would have to jump through hoops to get up and running.

All this motivated me to take on the recommended exercise in Lesson 8 of building a Deep Learning Rig/AI Sandbox. This was my first ever PC build and I benefited from having multiple PC Hardware retailers in my area that provide price matching with components from online retailers.

Parts

I greatly appreciate @jeremy's suggestion of using PC Part Picker to ensure component compatibility. My part list for the build was as follows:

PCPartPicker part list

Including the experience of botching my first attempt at thermal paste application and the subsequent cleanup, the hardware portion of the build took me the better part of a day. As for setting up the environment, after installing Ubuntu 16.04 and the NVIDIA driver, the rest was a snap thanks to the Docker image that @Matthew provided along with the recommended setup instructions.

Many thanks to all the others who have posted excellent build guides and tips both on the forum and via blog posts. Their inspiration gave me confidence that the project was definitely doable! So far, I have been extremely happy with the speed of the environment and the convenience of having my own box.

15 Likes

(Moustapha Cheikh) #2

congrats :grinning:

2 Likes

(Parthasarathy Mohan) #3

Congrats…

1 Like

(Hugues) #4

Hey @P_Wes Phil,

When you run the 3 epochs in Lesson1.ipynb from course 1, how long does it take on your machine?
On Paperspace, on a P4000 machine, it takes around 17 sec.

I’m considering building my own rig. My attempts to use my MacBook Pro with a GeForce 750M GPU are going nowhere: it has dual graphics cards, and I’m not sure it’s possible to get Ubuntu and the NVIDIA driver to run on it.

let me know

thanks

0 Likes

(Andrea de Luca) #5

Congrats. :slightly_smiling_face:

But do not use the spinning drive for the active datasets; it won’t be able to feed the GPU at an acceptable rate.

0 Likes

(Phil Weslow) #6

Yep, the 2TB drive is just used to store extra datasets until I am ready to work with them, at which point they get transferred to the SSD.

1 Like

(Jason Patnick) #7

@P_Wes I’m glad I came across this; I was planning to build something pretty similar. How is the i3 handling everything? Is there anything you would change about the setup if you were to do it again?

0 Likes

(Phil Weslow) #8

That’s a good question! I was actually quite concerned about the i3, which I picked for budget reasons. So far, working through Part One, I have not noticed any problems. While skimming through the lectures for Part Two, I got the impression that some of the notebooks might give slower CPUs some issues, but we’ll see. My preference would have been an i5. Also, for my primary drive I would choose an NVMe drive as opposed to a standard SSD.

The number one change I would make, though, is that I would probably purchase a GTX 1070 Ti mini if I had to do it again. I’d start with the card and build the system around it. In fact, I really wish I had known about the existence of the “minis” before I started my build. They really improve the price/performance ratio.

This is not to say that using my 1060 is awful, far from it! I prefer it to any cloud-based solution I have worked with. One of the models I’m working on right now (not part of Fast.ai) takes about 60 min to train each time I adjust the parameters. If I could get it done in 40 min with a 1070 Ti, I’d take it!

I planned out my current build with the intent of upgrading components down the line. One of my first swaps will probably be the processor, which will be cheap and easy.

3 Likes

(Jason Patnick) #9

Thanks for the info! Are you planning on accessing the desktop remotely with a laptop at all?

0 Likes

(Phil Weslow) #10

Absolutely, I remote in all the time! You can set up SSH (ideal for not wasting GPU resources) or remote in visually with software like TeamViewer.
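For the SSH route with a notebook server, one common pattern is local port forwarding. The following is only a minimal sketch under stated assumptions: the user name, host name, and port 8888 are placeholders, and `build_tunnel_command` is a hypothetical helper that just assembles the `ssh` command line rather than running it.

```python
def build_tunnel_command(user, host, local_port=8888, remote_port=8888):
    """Return the argument list for an SSH tunnel to a notebook server.

    -N : do not run a remote command, only forward ports
    -L : forward local_port on the laptop to remote_port on the rig
    All values are illustrative placeholders, not a real setup.
    """
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward, f"{user}@{host}"]


# Hypothetical user and host names; replace them with your own.
cmd = build_tunnel_command("user", "dl-rig.local")
print(" ".join(cmd))
```

With a tunnel like this open, a notebook served on the rig would be reachable from the laptop's browser at the forwarded local port, and no desktop session has to be rendered on the rig.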

0 Likes

#11

I did the whole of Part 1 on an i3, a 1060 6 GB, and 8 GB of RAM, and didn’t have any problems. (I already had a PC with an i3 and just bought the 1060.)
Sometimes I wished for more GPU memory and regretted not buying a 1080 Ti. But now, with cloud services (Google Cloud, AWS, Paperspace, etc.), I think it is smarter to just compute things online if my 1060 can’t handle them than to buy a 1080 Ti.
@P_Wes what was your problem with the cloud? Or do you just mean that it is more convenient to work on your own box?

I don’t have a public IP, so no SSH for me; that would be an additional hassle. I’m using TeamViewer, though, but it feels slow.

0 Likes

(Andrea de Luca) #12

How did you get that impression?

AFAIK, as long as you have 1C/2T per GPU, you’ll be fine.

0 Likes

(Jason Patnick) #13

Nice, that’s what I’m planning on doing. Two more questions, haha: what laptop are you using, and are you running Ubuntu on that as well?

0 Likes

(Dien Hoa TRUONG) #14

I have the same question as @sayko. Why don’t you like the cloud?

I am doing this course on my laptop, an HP Omen 17-w102nf. It also has the GTX 1060 6 GB. But I have the impression that it is quite a bit slower than the one Jeremy uses in the course.

I’m considering using my laptop just for learning and testing for now. When training takes too much time, I may use a cloud service like Paperspace.

Is the cheapest GPU on Paperspace, the P4000, faster than the GTX 1060?

Thank you for your help.

0 Likes

(Phil Weslow) #15

@pattyhendrix No, it’s a MacBook Pro that I use to connect.

@balnazzar Something I came across while skimming the Part 2 lectures. I’ll try to remember to post here when I come across that part again.

@sayko Part of the answer to why I have not been a fan of cloud computing for Deep Learning can be found in my first post in this thread. My experiences with three different services were all less than ideal. I realize that for training certain models there may be no way to avoid it. But I’ll do as much as I can from the convenience of my own rig.

1 Like

(Jason Patnick) #16

So you’re using a MacBook Pro running OS X and SSHing into your desktop running Ubuntu?

0 Likes

(Phil Weslow) #17

Primarily, I either access the machine locally or remote in using TeamViewer from the Mac.

0 Likes

(Theodoros Galanos) #18

When they first came out, two 1060s were faster than a 1080GTi, at least in graphics applications. That’s why NVIDIA disabled SLI on them, since two of those were cheaper than a 1080.

I have the same card, different model. I like it. I might buy another one, although upgrading might be the better next step. It depends on the datasets I’ll be working with in the coming months.

0 Likes

(Andrea de Luca) #19

Ti, not GTi :slight_smile:

However, the main problem with the 1060 is not speed, but memory.

With the 1080 Ti, I saw something like 65% memory usage (over 7 GB) while running the fastai notebooks with the fruits dataset. It is a big dataset, but not enormous. The NN architecture also affects memory usage.

Putting together two 1060s is a good choice in absolute terms. But to get twice the speed AND memory, be prepared to work with the PyTorch parallel API, since fastai does not support parallelism yet.

For a 1-GPU system, I think the bare minimum is the 1070, at least if you want to do something more than barely follow the lectures.

If you want to work around the memory issue, you can always lower the batch size. In that case, you have to experiment to find the optimal learning rate, which depends heavily on the batch size. If your minibatches are too small, you may never attain the same accuracy as those who use larger batches, no matter how much you tune the LR.
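As a starting point for that experimentation, a commonly used heuristic is to scale the learning rate in proportion to the batch size (often called the linear scaling rule). The sketch below is a rule of thumb, not something fastai does for you; `scale_learning_rate` is a hypothetical helper and the numbers are made up for illustration.

```python
def scale_learning_rate(base_lr, base_batch_size, new_batch_size):
    """Linear scaling heuristic: keep lr / batch_size constant."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


# Illustrative numbers only: a rate tuned at batch size 64,
# rescaled for a batch size of 16 that fits in less GPU memory.
new_lr = scale_learning_rate(0.01, 64, 16)
print(new_lr)
```

Treat the result as an initial guess only; it still needs to be checked by actually training at the new batch size.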

For something like $150 more, go for the 1070.

0 Likes

(Phil Weslow) #20

To be honest, the tone of your post comes off as a bit dismissive. This build required a significant amount of time, energy and resources, and I’m quite proud of it!

Obtaining access to GPU computing time is a potential hurdle for everyone trying to get into the field of Deep Learning. Therefore, there is inherent value to a multiplicity of approaches. Since Neural Networks generally do better with larger datasets, in some cases without an upper-bound in sight, all of us, as Deep Learning practitioners, will likely come across projects where skill at optimizing both batch size and learning rate will come in handy!

Also, I’ve been working on the Kaggle Diabetic Retinopathy challenge using the Fast.ai library. It’s an absolutely enormous dataset, over 100 GB in total size. Such an incredibly data intensive project represents far more than just ‘barely follow[ing] the lectures’.

Also, it’s very easy for someone with a build like mine to swap out a graphics card, when they are ready to upgrade. If they did not know they had the option to use something like a 1060, then that might forestall them getting into the field of Deep Learning. There is no time like the present, and we all know what happens to endeavors we push off too long.

Why not expose people to all their options? If you are trying to communicate some insight, there are ways to do it without adopting a dismissive tone.

0 Likes