Has there been any implementation of multi-GPU training in fastai v2?
I tried searching for it, but it looks like the fastai docs website has been erroring out. For example: https://docs.fast.ai/dev/gpu.html
I believe there were some propagation issues earlier. I have not tested the code, but this link works for me now.
Hello @ai_padawan
Do you mean Data Parallel or Distributed Data Parallel, or Model Parallel?
The first two are supported for many models in fastai2.
To run in batch mode from command line, please see the examples in https://forum.ailab.unb.br/t/data-parallel-dp-e-distributed-data-parallel-ddp-training-in-pytorch-e-fastai-v2/194
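To illustrate the difference between the first two modes: Data Parallel (DP) runs in a single process and replicates the model across GPUs, while Distributed Data Parallel (DDP) runs one process per GPU. A minimal sketch of the DP case in plain PyTorch (the tiny `nn.Linear` model is just a stand-in; fastai wraps this machinery for you, so this is only to show the underlying mechanism, not the fastai API):

```python
import torch
import torch.nn as nn

# A tiny stand-in model; any nn.Module works the same way.
model = nn.Linear(10, 2)

# Data Parallel: single process, model replicated across visible GPUs.
# On a single-GPU or CPU-only machine this branch is simply skipped.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# The forward pass looks identical to the single-device case;
# DataParallel scatters the batch across GPUs and gathers the outputs.
x = torch.randn(4, 10).to(device)
out = model(x)
print(out.shape)  # torch.Size([4, 2])
```

DDP, by contrast, is usually launched with one process per GPU from the command line (e.g. via `torch.distributed`), which is why batch-mode scripts like the ones in the link above are the typical entry point.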
For use inside a Jupyter session, I have made a library, mpify, for distributed function calls in Jupyter notebooks, and used it to port several fastai v2 notebooks to DDP training inside Jupyter.
Please see this post: Distributed training of course-v4 notebooks now possible
The fastai v2 notebooks I ported are in the examples directory. Some may need to be updated if the fastai2 code base has undergone any major changes in the past 3 months.
Pardon the documentation being a bit hard to read; I’m in the process of revamping it.
If you have problems using mpify, please open an issue against the mpify repo.
Thanks,
Phil