Fastai on Apple M1

Hi Everyone,

Has anyone successfully used fastai on an Apple M1 chip? Can you please share your experiences?



I’m using components of it to build out dataloaders, etc, but haven’t actually tried training models with it or running inference with it. So I can only say that, from a barebones standpoint: fastai installs fine via pip, and its components work.

OK, interesting. Can you please tell me if you are using an Air or a Pro? And, if possible, would you please train a simple fastai model and see if it works?

Air. Training models works. (It’s using the CPU.)

Hi vahuja4, hope all is well!

Below are a few links talking about the Apple Silicon M1.

mrfabulous1 :smiley: :smiley:

My training on the M1 is working very smoothly; there are no issues.
Best regards

@jamesp, @gunturhakim - thank you for your replies! Good to know that we can train fastai models on an M1 without any issues. Could you also comment on the training speed? Is it very slow compared to Google Colab? Thank you!

It doesn’t use the GPU. So you can design models locally, but I wouldn’t try to train them locally because of how slow it would be.

@ducha-aiki is playing with the M1 MacBook and PyTorch, and he has made some progress!

The M1 GPU does not work with CUDA code. Therefore your fastai / PyTorch training code will run on the CPU. If you want to use the GPU you’ll have to use one of Apple’s tools, like CoreML or CreateML.

The following is run from the M1 I’m typing on now:

[ins] In [1]: import torch
[ins] In [2]: torch.cuda.is_available()
Out[2]: False
[ins] In [5]: torch.cuda.get_device_name()
AssertionError                            Traceback (most recent call last)
<ipython-input-5-7c4a4a7ea8b8> in <module>
----> 1 torch.cuda.get_device_name()

~/nbs/venv/lib/python3.7/site-packages/torch/cuda/ in get_device_name(device)
    274         str: the name of the device
    275     """
--> 276     return get_device_properties(device).name

~/nbs/venv/lib/python3.7/site-packages/torch/cuda/ in get_device_properties(device)
    304         _CudaDeviceProperties: the properties of the device
    305     """
--> 306     _lazy_init()  # will define _get_device_properties
    307     device = _get_device_index(device, optional=True)
    308     if device < 0 or device >= device_count():

~/nbs/venv/lib/python3.7/site-packages/torch/cuda/ in _lazy_init()
    162                 "multiprocessing, you must use the 'spawn' start method")
    163         if not hasattr(torch._C, '_cuda_getDeviceCount'):
--> 164             raise AssertionError("Torch not compiled with CUDA enabled")
    165         if _cudart is None:
    166             raise AssertionError(

AssertionError: Torch not compiled with CUDA enabled

Yes, but no training and no GPU usage: inference and tutorial writing only.
Here is a speed test:

I have benchmarked the CPU speed of (patch extraction + HardNet, 3206 descriptors) on x86 (Rosetta) and arm64 (native) with kornia:

native: 3.46 s
x86: 7.78 s

Colab CPU: 6.43 s
Colab GPU: 141 ms
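Since the native and Rosetta numbers differ so much, it may be worth checking which kind of Python build you are actually running before timing anything. A minimal sketch using only the standard library, assuming macOS, where `platform.machine()` reports `arm64` for a native build and `x86_64` for an Intel build running under Rosetta. The helper name `describe_python_build` is made up for illustration.

```python
import platform


def describe_python_build(machine=None):
    """Classify the architecture of the running Python interpreter.

    `machine` defaults to platform.machine(); it can be passed
    explicitly to check a value reported by another machine.
    """
    if machine is None:
        machine = platform.machine()
    if machine == "arm64":
        return "arm64 (native on Apple Silicon)"
    if machine == "x86_64":
        # On an M1 this means the interpreter is translated by Rosetta 2.
        return "x86_64 (runs under Rosetta 2 on Apple Silicon)"
    return f"other ({machine})"


print(describe_python_build())
```

If this reports x86_64 on an M1, the timings you get should be compared with the Rosetta number above, not the native one.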

So, while it is nice and fast for CPU, it is nowhere near any GPU performance.
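To put the gap in numbers, here is a small sketch that turns the timings above into speed ratios. The values are copied from the benchmark in this post; the dictionary keys and the `speedup` helper are just illustrative names.

```python
# Timings in seconds, copied from the benchmark above (141 ms = 0.141 s).
timings = {
    "M1 native (arm64)": 3.46,
    "M1 x86 (Rosetta)": 7.78,
    "Colab CPU": 6.43,
    "Colab GPU": 0.141,
}


def speedup(baseline, other):
    """Return how many times faster `other` is than `baseline`."""
    return timings[baseline] / timings[other]


for name in timings:
    ratio = speedup("M1 native (arm64)", name)
    print(f"{name}: {ratio:.2f}x the speed of M1 native")
```

On these figures the native build is roughly 2.2x faster than the Rosetta build and about 1.9x faster than the Colab CPU, while the Colab GPU is roughly 25x faster than the native M1 CPU.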