Can folks also share how much time it takes to build and configure your own box? Hardware debugging is very scary to me, so I'd like to set realistic expectations - i.e. a "building your own box for dummies" time estimate.
Has anyone successfully used 2 GPUs in one box? From a preliminary Google search, it seems like parallelizing your programs across multiple GPUs is a bit of a pain.
I successfully cajoled a few professor friends of mine into applying for this NVIDIA academic grant, which offers free Titan X Pascal GPUs if you're accepted - so I may end up with more than one on hand.
Processing data in parallel (data parallelism) actually isn't that hard: you run a smaller batch on each GPU and concatenate the results on the CPU - kuza55 created a script on GitHub to do it automatically.
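The data-parallel pattern above can be sketched in a few lines of NumPy. Note that run_model here is a hypothetical stand-in for whatever per-GPU forward pass you'd actually run, and in real code each shard would be dispatched to a different device rather than looped over:

```python
import numpy as np

def run_model(sub_batch):
    # Hypothetical stand-in for the per-GPU forward pass;
    # a real setup would run this on a separate device.
    return sub_batch * 2.0  # pretend this is the model's output

def data_parallel(batch, n_gpus=2):
    # Split the batch into one smaller sub-batch per GPU...
    shards = np.array_split(batch, n_gpus)
    # ...run the same model on each shard (concurrently, on real GPUs)...
    outputs = [run_model(s) for s in shards]
    # ...then concatenate the results back together on the CPU.
    return np.concatenate(outputs)

batch = np.arange(8.0).reshape(4, 2)
out = data_parallel(batch, n_gpus=2)
print(out.shape)  # same leading dimension as the input batch
```

The key point is that the model is identical on every GPU; only the data is split, which is why the speedup is close to linear for large enough batches.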
Model parallelism is a bit trickier but also possible. You can assign different layers of your model to different devices relatively easily in TensorFlow (just use with tf.device('/gpu:%d' % n):, where n is your GPU number, e.g. 0). I haven't been able to get much of a speed improvement with this yet.
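To make the structure of that tf.device pattern concrete without needing TensorFlow installed, here's a toy sketch: device() is a made-up context manager that just records which "device" each layer ran on, and the two lambda layers are placeholder math. In real TensorFlow you'd replace device(...) with tf.device('/gpu:%d' % n) and the lambdas with actual layers:

```python
from contextlib import contextmanager

placements = []       # records which "device" ran each layer
_current = ['cpu:0']  # device stack; CPU by default

@contextmanager
def device(name):
    # Toy stand-in for tf.device: layers called inside the
    # block are attributed to `name`.
    _current.append(name)
    try:
        yield
    finally:
        _current.pop()

def layer(fn, x):
    placements.append(_current[-1])
    return fn(x)

# Assign different layers of the "model" to different devices,
# mirroring the tf.device usage described above.
x = 3.0
with device('gpu:0'):
    h = layer(lambda v: v + 1.0, x)   # first layer on GPU 0
with device('gpu:1'):
    y = layer(lambda v: v * 2.0, h)   # second layer on GPU 1

print(y, placements)  # 8.0 ['gpu:0', 'gpu:1']
```

One reason this often doesn't speed things up: the second device sits idle until the first finishes, so without pipelining you mostly just pay transfer costs between GPUs.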
As mentioned in class and in the slides, 2 GPUs are very helpful for running an experiment on one GPU while you continue working in a notebook on the other. So if you can afford it (or you get the grant!), it's a great idea.
It might be better if we all migrated this discussion to the main forum at Making your own server . That way we’ll be able to get help from more people - there’s nothing really part2-specific about this thread…