Distributed SGD Training

Even · June 16, 2017, 9:02pm

There’s a new paper out from Facebook Research on parallelizing SGD across multiple GPUs with large batches, which lets them train ResNet on the full ImageNet dataset in 1 hour. Pretty cool stuff!
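
For anyone wondering what splitting a large batch across GPUs looks like mechanically, here is a rough sketch of one synchronous data-parallel SGD step. It is not the paper’s code: the toy 1-D model, the names `grad_mse` and `data_parallel_step`, and the plain Python average standing in for a cross-GPU all-reduce are all my own assumptions for illustration.

```python
import random

def grad_mse(w, xs, ys):
    """Gradient of mean squared error for the toy 1-D model y = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def data_parallel_step(w, xs, ys, num_workers, lr):
    """One synchronous SGD step: shard the batch, then average worker gradients."""
    shard = len(xs) // num_workers  # assumes the batch divides evenly
    grads = []
    for k in range(num_workers):
        lo, hi = k * shard, (k + 1) * shard
        # Each iteration plays the role of one GPU working on its own shard.
        grads.append(grad_mse(w, xs[lo:hi], ys[lo:hi]))
    avg_grad = sum(grads) / num_workers  # stand-in for an all-reduce
    return w - lr * avg_grad

random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(64)]
ys = [3.0 * x for x in xs]  # synthetic data with true weight 3.0

# Eight simulated workers, each seeing 8 of the 64 examples.
w_parallel = data_parallel_step(0.0, xs, ys, num_workers=8, lr=0.1)

# The same update computed on the whole batch by a single worker.
w_single = 0.0 - 0.1 * grad_mse(0.0, xs, ys)
```

With equal-sized shards, averaging the per-worker gradients gives the same update as one worker processing the whole batch (up to floating-point rounding), so adding GPUs effectively means taking SGD steps with a bigger batch.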

niazangels · June 17, 2017, 7:30am

I suspect this xkcd comic will not be as funny in a few months: