A guide to using your M-Series Mac GPU with PyTorch

For those who have an M-series Mac (M1, M2, etc.), I’ve written up a to-the-point guide on how to use its GPU in PyTorch for better performance.
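As a quick taste, here is a minimal sketch of the basic pattern, assuming PyTorch 1.12 or newer (where the `mps` backend was introduced). The helper name `pick_device` is my own and not part of PyTorch.

```python
import torch


def pick_device() -> torch.device:
    """Return the MPS device when it is usable, otherwise fall back to the CPU."""
    # is_available() is False on non-Apple hardware and on unsupported macOS versions.
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = pick_device()

# Create the tensor directly on the chosen device and run a small computation there.
x = torch.ones(2, 3, device=device)
y = (x * 2).sum()

print(device, y.item())
```

Models are moved the same way with `model.to(device)`, so the rest of a training loop can usually stay unchanged.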

If you have any comments, questions, suggestions, feedback, criticisms, or corrections, please do let me know!


I tried this, but I think some operations are not yet supported on the PyTorch MPS devices.

NotImplementedError: The operator 'aten::cumsum.out' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

I could not get around this even after trying the temporary fix :confused:

Yeah, certain operations and features aren’t implemented yet. Support is still in its early stages and more optimizations await.

You can check this GitHub issue to see which operators are implemented, which are in progress, and which are still missing: https://github.com/pytorch/pytorch/issues/77764
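For the fallback itself, one thing worth checking is when the variable gets set. As far as I know, PyTorch reads `PYTORCH_ENABLE_MPS_FALLBACK` when it is first loaded, so setting it after `import torch` may have no effect. A minimal sketch that sets it from Python, using `cumsum` only because it is the op from the error above:

```python
import os

# Set the flag BEFORE torch is imported; this ordering is the assumption being illustrated.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

x = torch.arange(5, dtype=torch.float32, device=device)

# On MPS builds without a native cumsum, this op should now run on the CPU instead of raising.
result = torch.cumsum(x, dim=0).cpu()

print(result)
```

Exporting the variable in the shell before launching Python (`export PYTORCH_ENABLE_MPS_FALLBACK=1`) should work as well, as long as the script is started from that same shell.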

I don’t know if this is the right place, but I am getting a similar NotImplementedError, and it persists even with the PyTorch fallback environment variable set :frowning:

Would you mind sending your code snippet where you set the environment variable? I’ll try and see if I can figure out what’s happening.
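In the meantime, a small diagnostic might help narrow it down. This is only a sketch, and `fallback_report` is a made-up helper name. It uses nothing but the standard library to show whether the running process can see the variable at all and whether `torch` has already been imported at that point.

```python
import os
import sys


def fallback_report() -> dict:
    """Report what this Python process knows about the MPS fallback flag."""
    return {
        # None means this process never received the variable.
        "flag_value": os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK"),
        # True means torch was imported before this check ran.
        "torch_already_imported": "torch" in sys.modules,
    }


print(fallback_report())
```

If `flag_value` comes back as `None`, the variable was probably exported in a different shell or set after the notebook kernel started. If it is set but `torch_already_imported` is `True` at the point where you assign it, try moving the assignment above the import.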

You can also try searching the PyTorch forums for the error, or posting your question there.