Multi-GPU sharding with Fairscale, other experiences with multi-GPU?

After reading this article, https://towardsdatascience.com/sharded-a-new-technique-to-double-the-size-of-pytorch-models-3af057466dba, I think it would be interesting to port Fairscale's multi-GPU sharding to fastai.

What have been your multi-GPU experiences with fastai?
