Hello @ai_padawan
Do you mean Data Parallel (DP), Distributed Data Parallel (DDP), or Model Parallel?
The first two are supported for many models in fastai2.
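In case the distinction between the first two is unclear, here is a rough, purely conceptual sketch in plain Python. It is not PyTorch or fastai code, and the function names `dp_split_batch` and `ddp_shard_indices` are made up for illustration. The assumption is the usual setup: with DP a single process cuts each batch into chunks, one per GPU, while with DDP there is one process per GPU and each process works on its own shard of the dataset.

```python
import math


def dp_split_batch(batch, num_gpus):
    """DP style (illustrative): one process cuts a batch into contiguous chunks, one per GPU."""
    if not batch:
        return []
    chunk = math.ceil(len(batch) / num_gpus)
    return [batch[i:i + chunk] for i in range(0, len(batch), chunk)]


def ddp_shard_indices(num_samples, rank, world_size):
    """DDP style (illustrative): each process (rank) takes every world_size-th sample index."""
    per_rank = math.ceil(num_samples / world_size)
    total = per_rank * world_size
    # Wrap around so that every rank ends up with the same number of samples.
    indices = [i % num_samples for i in range(total)]
    return indices[rank:total:world_size]


if __name__ == "__main__":
    print(dp_split_batch(list(range(8)), num_gpus=2))
    for rank in range(2):
        print(rank, ddp_shard_indices(10, rank, world_size=2))
```

In real DDP training the sharding is typically handled for you by a distributed sampler or dataloader, so this is only meant to show the idea.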
To run in batch mode from the command line, please see the examples in https://forum.ailab.unb.br/t/data-parallel-dp-e-distributed-data-parallel-ddp-training-in-pytorch-e-fastai-v2/194
For use inside a Jupyter session, I have made a library, mpify, for distributed function calls in Jupyter notebooks, and used it to port several fastai v2 notebooks to DDP training inside Jupyter.
Please see this post: Distributed training of course-v4 notebooks now possible
The fastai v2 notebooks I ported are in the examples directory. Some may need to be updated if the fastai2 code base has undergone any major changes in the past 3 months.
Pardon the documentation, which may be a bit hard to read; I'm in the process of revamping it.
If you have a problem using mpify, please open an issue against the mpify repo.
Thanks,
Phil