Relative importance of hard drives in training NNs

balnazzar · June 2, 2018, 4:15pm

I’m in the process of building a new rig for DL, and I’d like to know whether anyone has benchmarked NVMe drives against regular SATA SSDs during training, particularly with large datasets.

I’m asking because the minibatches have to be fed to the GPU(s) from the drive, so I assume (perhaps incorrectly?) that a relatively slow SSD could be a significant bottleneck, no matter how much RAM you have or how powerful your GPU is.

Thanks.

tensoralex · June 3, 2018, 9:19pm

I haven’t done a benchmark, but I’ve seen throughput during training sometimes go up to 600-900 MB/s, which regular SATA SSDs wouldn’t be able to sustain. It all depends on how the specific dataloader works, the type of training, the libraries, and CPU and GPU power.
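For a rough feel of when the drive becomes the limit, here is a back-of-the-envelope sketch. It reads the throughput figure above as megabytes per second. The function names and the workload numbers (6000 samples/s, about 110 KB per sample) are made up for illustration, and it assumes every sample is read from disk with no help from the OS page cache. The only hard number is the SATA III ceiling: a 6 Gbit/s link with 8b/10b encoding tops out at 600 MB/s, and real drives land somewhat below that.

```python
# Rough estimate only: all workload numbers below are hypothetical.

SATA3_CEILING_MB_S = 600.0  # 6 Gbit/s link, 8b/10b encoding -> 600 MB/s max


def required_read_mb_per_s(samples_per_s, avg_sample_bytes):
    """Disk read rate (MB/s, 1 MB = 10**6 bytes) needed to keep training fed.

    Assumes each sample is read from disk every time it is used,
    i.e. nothing is served from RAM / page cache.
    """
    return samples_per_s * avg_sample_bytes / 1e6


def is_storage_bound(samples_per_s, avg_sample_bytes, drive_mb_per_s):
    """True if the workload needs more bandwidth than the drive can deliver."""
    return required_read_mb_per_s(samples_per_s, avg_sample_bytes) > drive_mb_per_s


if __name__ == "__main__":
    # Hypothetical workload: 6000 images/s at ~110 KB each.
    need = required_read_mb_per_s(samples_per_s=6000, avg_sample_bytes=110_000)
    print(f"needed: {need:.0f} MB/s, SATA III ceiling: {SATA3_CEILING_MB_S:.0f} MB/s")
    print("storage bound on SATA:", is_storage_bound(6000, 110_000, SATA3_CEILING_MB_S))
```

If the dataset fits in RAM, most reads after the first epoch come from the page cache, so the real requirement can be far lower than this estimate.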

balnazzar · June 3, 2018, 9:44pm

Thanks, good reference. I didn’t imagine almost 1 GB/s.