AWS is very over-priced for GPUs. You can buy a GTX 1070 for around $300 that gives better performance than the AWS P2’s GPU. So I think it’s a good idea to build your own deep learning machine if you can.
It is definitely possible to put together a home system – mine is Ubuntu-based. I log into it remotely and interact with Jupyter from another machine, to minimize resource usage on the GPU server.
NVIDIA has a GPU grant program for academics – not students, but PIs (https://developer.nvidia.com/academic_gpu_seeding). I’m not eligible, but I’m working on inspiring a PI to write a grant and let me put together a machine for them. Will keep you posted.
The pain with AWS was getting registered (I’m still not, after a long support chain), followed by the constant vigilance of remembering to turn the machine off.
It only makes moderate economic sense. At some level this is a gym membership, where I have to promise myself that I will use the server quite a bit to justify it: basically a few months’ worth. There isn’t that much difference between weight training and training weights from a cash perspective.
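For a rough feel of the gym-membership math, here is a minimal sketch of a break-even estimate. The machine cost, the hourly cloud rate and the hours of use per day are all hypothetical placeholders, not figures from this thread, and the estimate ignores electricity and resale value.

```python
def break_even_days(machine_cost, cloud_rate_per_hour, hours_per_day):
    """Days of use after which buying costs less than renting by the hour.

    Ignores electricity, resale value and storage charges.
    """
    break_even_hours = machine_cost / cloud_rate_per_hour
    return break_even_hours / hours_per_day


# All three inputs are made-up placeholders for illustration only.
days = break_even_days(machine_cost=1000.0,
                       cloud_rate_per_hour=0.90,
                       hours_per_day=12)
print("Break-even after about {:.0f} days of use".format(days))
```

Plug in your own build cost and the current hourly rate to see how many months of regular use it takes to come out ahead.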
The coworker is in the Udacity program and is pretty excited about it.
@arthurconner if you ask @datainstitute for help on Slack they should be able to get your AWS set up, if you’re still interested in doing that.
@lin.crampton frankly I’m not sure that program is worth it at the moment, since they don’t give out Pascal-based cards; also, all they provide is the card, not the rest of the server. Pascal cards are such a big step in performance, and (if you get a 1070 or even a 1080) not a huge chunk of the cost of the server - so I’m not sure how much benefit it provides to get their grant…
I set up my own Ubuntu 16.04 machine with a GTX 1060 6GB (I wish I had gotten the 1070 with 8GB, since the 1060 ran out of memory on the first lesson).
After installing Ubuntu my setup was roughly the following:
Install CUDA 8.0 and cuDNN
Anaconda and Python
sudo apt install unzip
conda create -n fastai34 python=3.4
source activate fastai34;
conda install matplotlib
conda install cloudpickle
conda install opencv
conda install pandas
conda install bcolz
conda install scikit-learn
conda install theano
conda install keras
conda install jupyter
#switch keras to use theano (set "backend" to "theano" in ~/.keras/keras.json)
#create ~/.theanorc ->
[global]
floatX = float32
device = gpu0
[nvcc]
fastmath = True
unzip -q vgg16.zip
unzip -q utils.zip
#modify the lesson1 notebook (reload is no longer a builtin in Python 3):
add “from imp import reload” above “import utils; reload(utils)”
#On login, switch to your fastai34 env
source activate fastai34;
#Run jupyter notebook remotely
ssh -L 8888:127.0.0.1:8888 <machine address>
jupyter notebook --no-browser &
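If you would rather generate the .theanorc from the steps above than type it by hand, here is a minimal sketch. It assumes the usual Theano layout, with floatX and device under a [global] section and fastmath under [nvcc]; the function name render_theanorc is made up for this example.

```python
import configparser
import io


def render_theanorc(device="gpu0", float_type="float32", fastmath=True):
    """Return the text of a .theanorc with the settings from the steps above."""
    cfg = configparser.ConfigParser()
    # configparser lowercases keys by default; keep "floatX" spelled as given.
    cfg.optionxform = str
    cfg["global"] = {"floatX": float_type, "device": device}
    cfg["nvcc"] = {"fastmath": str(fastmath)}
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


print(render_theanorc())
```

Saving the returned text as ~/.theanorc gives the same settings as the manual step. The helper only returns a string, so nothing is overwritten by accident.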
Awesome. What sort of performance are you getting compared to the AWS basic GPU?
What’s the best way to benchmark the performance?
In the first lesson it took 318 seconds to fine-tune with batch_size=32.
It ran out of memory with batch_size=64
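One simple way to benchmark is to time the same call on each machine with a wall-clock timer. This is a minimal sketch: time_call and dummy_fit are made-up names, and dummy_fit is only a stand-in for the lesson 1 fit call in the notebook.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def dummy_fit(n):
    """Stand-in workload; replace with the real training call."""
    return sum(i * i for i in range(n))


result, seconds = time_call(dummy_fit, 100000)
print("fit took {:.3f}s".format(seconds))
```

For a fair comparison, keep the batch size and the data the same on both machines, since both change the runtime.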
Does anyone have a recommendation on what kind of motherboard I should get if I want to be able to use multiple GPUs (2 for now)?
And is there a specific version of the 1070 I should try to get?
That sounds reasonable. You could compare runtimes / memory issues with our AWS GPUs since we know those are working for most use cases in this course.
Most builds I’ve seen use the Gigabyte GA-X99-Ultra, which would allow you to use up to 4 GPUs…
A couple of interesting builds:
MSI and EVGA are the most popular due to performance and warranty. I am a big fan of MSI but EVGA has an excellent warranty.
I get 229s with my 1070 on lesson one’s first fit. On AWS P2 I believe it is around 650s.
Setting up your own machine is really easy. Even a 1060 (sub $200) will blow the doors off an AWS P2 instance, with no monthly costs. The 1070 is the sweet spot but will go for around $400. I highly recommend using Linux, as Windows is always fighting against the grain when installing dependencies, and Linux just works faster and smoother. Especially with Jupyter notebooks, the difference is huge.
Do I need a powerful CPU to set up a server? Or is the CPU not doing any work when training with CUDA?
Consensus is that the CPU doesn’t need to be that powerful, as long as you can support enough processes per GPU (and depending on what kind of preprocessing you need to do).
I’ve seen the Intel 6700 in a bunch of builds so that seems reasonable (and you could probably get by with less if necessary).
Here are the components I chose.
It isn’t CPU bound when training on GPU, but it will peg one core at 100%, so single-threaded performance does factor in a little. More important is the bus: using PCI Express Gen 3 is a 10-15% improvement with the same card, and using a faster CPU can be another 10-15%.
For example, I had a 4.3GHz overclocked Ivy Bridge (Intel 3770K) with an NVIDIA 1070 and would get times of around 324s training lesson 1’s first fit. When I upgraded to a Kaby Lake 7700K, using the same GPU but a faster CPU and Gen 3 PCI Express, my times dropped nearly 30% to 229s.
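To check the arithmetic on those timings, here is a small sketch. The helper names are made up for this example; the inputs are the 324s and 229s figures above, plus the roughly 650s P2 figure quoted earlier in the thread.

```python
def percent_faster(old_seconds, new_seconds):
    """Percentage reduction in runtime going from old to new."""
    return (old_seconds - new_seconds) / old_seconds * 100.0


def speedup(old_seconds, new_seconds):
    """How many times faster the new run is than the old one."""
    return old_seconds / new_seconds


# Ivy Bridge 3770K vs. Kaby Lake 7700K, same 1070 GPU.
print(round(percent_faster(324, 229), 1))  # 29.3
print(round(speedup(324, 229), 2))         # 1.41

# Approximate AWS P2 time vs. the 1070 on the 7700K.
print(round(speedup(650, 229), 2))         # 2.84
```

So "nearly 30%" holds up, and the home machine comes out a little under three times faster than the quoted P2 time.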
When I train on GPU I can see one thread at 100% usage. So the CPU does factor in noticeably with GPU training, but you will already be running quite fast; it’s just the difference between fast and faster, and nowhere near as important as the GPU in general.
Good choices. I would recommend going Kaby Lake though and, if you can, just spend the extra $100 on the 7700K. You can also get a Z270, which is the newer Intel chipset, for nearly the same money.
I’d also recommend swapping out the power supply for an EVGA Gold G2; it’s about $10 less but a much better power supply, and it has a $20 rebate (just got one myself).
Good call. I’ll take a look.