Building deep learning workstation for $2200


Gave the examples/text notebook a go for AWD-LSTM, but the model gave an error when I tried to move it to fp16:
“Expected tensor for argument #1 ‘indices’ to have scalar type Long; but got torch.cuda.HalfTensor instead (while checking arguments for embedding)”
Points to torch/nn/ so appears to be upstream of fastai.
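
One plausible cause (an assumption, not something confirmed in the traceback) is that the integer token indices get cast to half along with everything else, and the embedding layer only accepts integer indices. A minimal sketch of a cast that leaves integer tensors alone; `half_if_float` is a hypothetical helper name, not a fastai or PyTorch function:

```python
import torch
import torch.nn as nn

def half_if_float(t: torch.Tensor) -> torch.Tensor:
    """Cast floating-point tensors to fp16; leave integer tensors (e.g. token ids) unchanged."""
    return t.half() if t.is_floating_point() else t

token_ids = torch.tensor([[1, 2, 3]])   # int64 ("Long") indices, which nn.Embedding expects
features = torch.randn(1, 3)            # float32 values, safe to cast to half

safe_ids = half_if_float(token_ids)         # stays int64
safe_features = half_if_float(features)     # becomes float16

emb = nn.Embedding(num_embeddings=10, embedding_dim=4)
out = emb(safe_ids)   # fine; emb(token_ids.half()) would raise a similar dtype error
```

If the inputs are being converted somewhere inside the training loop, the same check would have to go there rather than in user code.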

(Andrea de Luca) #23

Sounds quite related to fp16…

Keep us posted, should you manage to get it working!

(Scott H Hawley) #24

Big relief to read you say “28 PCIE lanes should be ok for three GPUs”.
Have you built your system – how is it?

I’m building a system with an i7-7800X and two RTX 2080Ti’s, but got worried that, since the CPU can’t run two cards at full x16, I’d hit a bottleneck. (I’m used to working with Xeon CPUs, which have lots of PCIE lanes.)

(Andrea de Luca) #25

According to Tim Dettmers, who ran a series of systematic experiments, x8 gen3 does actually provide sufficient bandwidth for 2-3 cards. Note also that a single Xeon cannot provide the 48 lanes necessary to run 3 cards at full x16 (Skylake Xeon W could, but 4-8 lanes are always used for other stuff).
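
For a rough feel of the lane arithmetic, here is a small sketch. It assumes PCIe gen3 signalling (8 GT/s per lane with 128b/130b encoding) and the 28 lanes mentioned above; how a given motherboard actually wires its slots is not covered, and the function names are made up for illustration:

```python
# PCIe gen3: 8 GT/s per lane, 128b/130b encoding, 8 bits per byte
# -> roughly 0.985 GB/s per lane, per direction (theoretical peak).
GEN3_GB_PER_S_PER_LANE = 8 * 128 / 130 / 8

def link_bandwidth_gb_per_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth of a gen3 link with the given lane count."""
    return lanes * GEN3_GB_PER_S_PER_LANE

def fits(cpu_lanes: int, link_widths: list[int]) -> bool:
    """True if the requested GPU link widths fit within the CPU's lane budget."""
    return sum(link_widths) <= cpu_lanes

CPU_LANES = 28  # i7-7800X, per the discussion above

two_full = fits(CPU_LANES, [16, 16])     # two cards at x16 need 32 lanes
two_mixed = fits(CPU_LANES, [16, 8])     # x16 + x8 needs 24 lanes
three_x8 = fits(CPU_LANES, [8, 8, 8])    # three cards at x8 need 24 lanes

x8_bw = link_bandwidth_gb_per_s(8)       # about 7.9 GB/s
x16_bw = link_bandwidth_gb_per_s(16)     # about 15.8 GB/s
```

So on 28 lanes, two cards cannot both run at x16, but x16/x8 or x8/x8/x8 fits with a few lanes to spare.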

(Scott H Hawley) #26

Thanks, I saw that too (after I posted here)!

Just posted a “Completed Build” on PCPartpicker:

The budget for this one was $4000, so I got extra RAM and a second GPU.