The hardware is not complex; the hard part is choosing optimal, compatible parts. You can get a lot of help with that by verifying your build on PCPartPicker or the r/buildapc subreddit before purchasing. The actual building is really easy, even the first time, and there are lots of tutorial videos.
The most important things in a laptop are an NVIDIA 10xx card and an SSD. Regular hard drives are insanely slow on laptops, since laptops use even slower drives than desktops. That said, not all SSDs are the same; there is a huge difference in performance. PCI Express/NVMe drives are the best you can get (3000MB/s reads/writes!) but cost twice as much as SATA SSDs, and much more than 2x when bought with a machine. SATA SSDs cap at 550MB/s read/write but vary tremendously, from 100MB/s to 550MB/s, often with mismatched read and write speeds. Some research can usually tell you what to expect from a particular laptop. Apple tends to use high-speed (900MB/s+) SSDs, whereas most other manufacturers use SATA in the 300-400MB/s range unless they state PCIe or NVMe.
Yeah, GPU memory is the bottleneck.
Unless you plan on having more than 4 Titan cards, you don’t really need more than 64 GB of RAM (with data generators and a good SSD). There are a bunch of decent options for working with datasets that don’t fit in memory.
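For anyone unsure what a data generator buys you here, this is a rough sketch of the idea in plain Python: read the file lazily and hand back one batch at a time, so RAM only ever holds a single batch. The file name, the CSV layout and the batch_generator helper are all made up for illustration; a real pipeline would probably decode images and loop forever for the training framework.

```python
import os
import tempfile


def batch_generator(path, batch_size):
    """Yield lists of parsed rows, reading the file lazily so that only
    one batch is held in memory at a time."""
    batch = []
    with open(path) as f:
        for line in f:
            batch.append([float(x) for x in line.split(",")])
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:  # final, possibly smaller, batch
        yield batch


# Build a tiny hypothetical dataset on disk just to demonstrate the idea.
tmp_dir = tempfile.mkdtemp()
data_path = os.path.join(tmp_dir, "toy_dataset.csv")
with open(data_path, "w") as f:
    for i in range(10):
        f.write("%d,%d\n" % (i, i * i))

batches = list(batch_generator(data_path, batch_size=4))
batch_sizes = [len(b) for b in batches]
print(batch_sizes)
```

The same pattern scales to files far larger than RAM, since nothing outside the current batch is kept around.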
Thanks for your insight, @dradientgescent! Will be on the lookout for those specs.
Jeremy was recommending more than 64 GB of RAM, so I’m confused.
Hm, are you sure you aren’t thinking of the size of the hard drive in an AWS instance?
I think there was a conversation earlier about 64 GB being the minimum size for that, but storage is cheap for your own server (512 GB SSD is just a bit over $100).
I put 64 GB of RAM in mine, but would have been fine with 32.
I remember reading a recommendation somewhere of 2x your total GPU memory (2x8 for me), although that seemed a bit arbitrary and you can probably get by with less than that.
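Spelled out as arithmetic, the rule of thumb looks like the sketch below. It assumes "2x8" means two 8 GB cards, which is my reading rather than something stated outright, and suggested_ram_gb is just a hypothetical helper name.

```python
def suggested_ram_gb(gpu_memory_gb, factor=2):
    """Rule of thumb from the thread: system RAM = factor * total GPU memory."""
    return factor * sum(gpu_memory_gb)


# Assumption: two cards with 8 GB each.
two_card_build = suggested_ram_gb([8, 8])
print(two_card_build)
```

Under that reading the rule lands on 32 GB, which lines up with 32 being enough in practice.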
I finally got my machine up and working last week. Thanks to all of you on this thread for your inputs!
For those of you who want your own deep learning machine but are a little unsure of the time investment in setting it up, here is a blog post that I put together after going through it myself: https://medium.com/@sravsatuluri/setting-up-a-deep-learning-machine-in-a-lazy-yet-quick-way-be2642318850#.frf3lcdlk
Cool write up. Didn’t realize those Central Computers people were so young.
Lots of dynamic DNS options out there in case your IP changes. So far http://www.noip.com is working for me, but I’m sure other people will have other recommendations.
I’d imagine it’s not too hard to spin up your own with an AWS micro instance and Elastic IP.
LOL. Yes, the youngest is 2 years old.
He totally looked like he knew what he was doing, I would have let him take a crack at it.
Thank you for this. This ~/.theanorc file was what I needed too, with one very important twist:
Missing the 0 stopped my GPU from being accessed, and made me go crazy reinstalling CUDA a few more times.
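The exact line isn't shown above, so this is only a guess at what the check might look like: a small sketch that parses .theanorc-style text and reports whether the device entry carries an explicit GPU index (gpu0 rather than plain gpu). The file contents and the gpu_device_has_index helper are hypothetical.

```python
import configparser

# Hypothetical file contents; the original post's exact file is not shown.
EXAMPLE_THEANORC = """
[global]
device = gpu0
floatX = float32
"""


def gpu_device_has_index(config_text):
    """Return True if [global] device is 'gpu' followed by a number."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    device = parser.get("global", "device", fallback="cpu")
    return device.startswith("gpu") and device[3:].isdigit()


has_index = gpu_device_has_index(EXAMPLE_THEANORC)
missing_index = gpu_device_has_index("[global]\ndevice = gpu\n")
print(has_index, missing_index)
```

Running something like this before reinstalling CUDA is a cheap way to rule out a config typo.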
I’ve created a collaborative list here for all the parts needed for setting up a Deep Learning/ML box.
With the approval of @brendan, I took some of his recommendations, as well as others in the forum.
The lists are collaborative, and anyone who signs in should be able to edit the description and add items.
I would love to see some convergence on the best setup, with a few alternatives for each budget range.
Disclaimer: I’m the founder of Giraffe List
Hi Davecg, with such a powerful system you’re rocking, I wonder what your times for dogs vs cats were compared to AWS? Building my own soon, and curious what to expect.
I’m not sure what the best way to benchmark things would be, and I haven’t been using VGG16 much (always at least VGG19 or one of the Google architectures), with at least two dense or convolutional trainable layers at the end.
Those are usually ~240 s on the dogs vs cats dataset per epoch.
Multiple GPUs haven’t been a huge advantage for training individual models, but they do let me train two things at once, which is great. (I’m sure with a bit of optimization I could do better, but either there is enough overhead from splitting data or the model across GPUs, or my CPU isn’t preprocessing data fast enough to keep both GPUs busy.)
I’m running my scripts in Docker containers (built from the Dockerfile and Makefile distributed with Keras), so it is very easy to isolate a container to one GPU or the other.
E.g., run a script in one container on the first GPU and host my notebook from another on the second GPU.
All you have to do is set the NV_GPU environment variable before calling nvidia-docker.
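If you launch containers from a script, the same trick can be sketched in Python: copy the environment, set NV_GPU, and pass it along when calling nvidia-docker. The image name and training script below are placeholders, and the actual launch is left commented out since it needs nvidia-docker installed.

```python
import os


def nvidia_docker_invocation(gpu_ids, image, command):
    """Build the environment and argv for running a container on specific GPUs."""
    env = dict(os.environ)
    # NV_GPU tells nvidia-docker which GPUs to expose to the container.
    env["NV_GPU"] = ",".join(str(g) for g in gpu_ids)
    argv = ["nvidia-docker", "run", "--rm", image] + list(command)
    return env, argv


# "keras-image" and "train.py" are hypothetical names for illustration.
env, argv = nvidia_docker_invocation([0], "keras-image", ["python", "train.py"])
print(env["NV_GPU"], argv)

# On a machine with nvidia-docker installed you could then run:
# import subprocess
# subprocess.call(argv, env=env)
```

Calling it again with [1] gives a second container pinned to the other card.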
We probably should come up with a standard benchmarking script to compare builds - I’m sure other people have better setups than me.
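As a starting point, a benchmark script could be as simple as the sketch below: wrap whatever runs one epoch in a timer and report seconds per epoch. The time_epochs helper and fake_epoch stand-in are made up here; in practice run_epoch would call your model's fit for one epoch on an agreed dataset and batch size.

```python
import time


def time_epochs(run_epoch, n_epochs=1):
    """Call run_epoch n_epochs times and return the wall-clock seconds of each."""
    durations = []
    for _ in range(n_epochs):
        start = time.perf_counter()
        run_epoch()
        durations.append(time.perf_counter() - start)
    return durations


def fake_epoch():
    # Stand-in workload so the sketch runs without a GPU or a dataset.
    sum(i * i for i in range(100000))


durations = time_epochs(fake_epoch, n_epochs=3)
print(["%.4fs" % d for d in durations])
```

Reporting every epoch rather than an average makes it easy to see first-epoch warm-up costs.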
I like to use the first epoch of lesson1 (the initial fitting for cats & dogs) as the comparison, since I know it well and we have been using it a lot on the forums. I know AWS will get around 650s per epoch doing that, a 1060 will get around 350s, and a 1070 as low as 229s. For the new 1080 Ti, I’m calculating you can get that down to 140s or lower.
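To put those epoch times in perspective, here they are turned into speedups over AWS. The numbers are the ones quoted above (the 1080 Ti figure is an estimate, not a measurement); the dictionary keys are just labels.

```python
# Seconds per epoch as quoted in the post above.
epoch_seconds = {
    "AWS": 650,
    "1060": 350,
    "1070": 229,
    "1080 Ti (estimate)": 140,
}

baseline = epoch_seconds["AWS"]
speedups = {name: round(baseline / secs, 2) for name, secs in epoch_seconds.items()}

for name, factor in speedups.items():
    print("%-20s %.2fx" % (name, factor))
```

So a 1070 comes out a bit under 3x faster than AWS on this test, and the estimated 1080 Ti a bit over 4.5x.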
As for multiple GPUs, cats and dogs won’t parallelize well because it uses Theano, which is really poor at parallelization (it is very hard to enable and requires changing a lot of your code base).
For Part II, I think multiple GPUs will be much more successful, as TensorFlow has much better multi-GPU support and there is a module you can install that lets you parallelize without any change to your codebase.
I just got my own machine all set up yesterday: a self-built Ubuntu 16.04 box with a Gigabyte 1080 FE card. I didn’t note down my time for cats and dogs, but it was in the mid 200s.
I ran nvidia-smi -l 5 in the terminal and it was really nice: I could see the memory being reserved when I run from theano.sandbox import cuda. I’ve set cnmem = 0.9 in my ~/.theanorc file because I’m also driving the monitor from the same card.
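If you'd rather log this than watch the terminal, nvidia-smi can also print selected fields as CSV, which is easy to parse. A rough sketch, assuming nvidia-smi is on the PATH for the live call; the sample line at the bottom is made up for illustration and is not a real reading.

```python
import subprocess

QUERY_FIELDS = ["memory.used", "memory.total", "temperature.gpu", "fan.speed"]


def parse_gpu_stats(csv_line):
    """Turn one CSV line of nvidia-smi query output into a dict of strings."""
    values = [v.strip() for v in csv_line.split(",")]
    return dict(zip(QUERY_FIELDS, values))


def read_gpu_stats():
    """Query every GPU once. Requires nvidia-smi to be installed."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ])
    return [parse_gpu_stats(line) for line in out.decode().strip().splitlines()]


# Illustrative line only (MiB used, MiB total, temperature, fan %).
sample = parse_gpu_stats("7300, 8114, 80, 53")
print(sample)
```

Calling read_gpu_stats() in a loop with a sleep gives roughly what nvidia-smi -l 5 shows, but in a form you can write to a file.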
It doesn’t take long for the card to get up to 80° and yet the GPU fan only goes to 53%. My understanding is that the default fan profile on these cards is designed for gaming and not training neural nets.
Does anybody know how to adjust the fan profile?
Thanks for that. I’ll try it tonight when I get home.
I’ve been using TensorFlow the whole time.
Had been working with Keras with the TF backend before discovering the course, so I adjusted things to that when working through part 1.
I haven’t really tried multi-GPU much, mainly running separate experiments on each card. Data parallelization is pretty easy in TensorFlow, but I haven’t quite figured out the best way to do model parallelization yet.
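For anyone new to the term, here is the idea behind data parallelization in plain Python, with no TensorFlow calls at all: split a batch into equal shards, compute the gradient on each shard (one per "device"), then average. The toy model y = w * x and both helper names are made up for illustration; with equal shard sizes the averaged gradient matches the full-batch one.

```python
def gradient(w, xs, ys):
    """Gradient of mean squared error with respect to w for the model y = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def data_parallel_gradient(w, xs, ys, n_devices):
    """Average per-shard gradients; assumes len(xs) is divisible by n_devices."""
    shard = len(xs) // n_devices
    grads = []
    for d in range(n_devices):
        lo, hi = d * shard, (d + 1) * shard
        # In a real setup each shard would be processed on its own GPU.
        grads.append(gradient(w, xs[lo:hi], ys[lo:hi]))
    return sum(grads) / n_devices


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
full = gradient(0.5, xs, ys)
split = data_parallel_gradient(0.5, xs, ys, n_devices=2)
print(full, split)
```

Model parallelization is the harder case because the layers themselves live on different devices, so activations have to cross between cards on every forward and backward pass.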
I think a large amount of RAM is important for the dense layers. I am running the lessons without a GPU, and an epoch takes 18000s, due in part to memory swapping. It requires around 25GB of total memory but I only have 16GB available. I do have other processes open, including many web browser pages, which could be closed. I could add a CUDA GPU, but I would still be limited by RAM without major upgrades (extra CPU, a different Apple BIOS, etc.).
I found the MNIST ipynb most intuitive as epochs take under 10 minutes.
So I am off to build a server. Does anyone have suggestions for RAM requirements: 32, 64, 96, or 128GB?