But I learned that PyTorch now supports M1 Macs, so I attempted to install fastai on my M1 Mac. FYI, I use miniconda, then installed PyTorch and fastai separately via pip.
The installation completed without any error message, and I verified that my installed PyTorch has access to the M1 GPU, i.e.
In [5]: torch.backends.mps.is_available()
Out[5]: True
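For anyone who wants a slightly fuller check than the REPL call above, here is a minimal sketch (my own, not from the original posts) that selects the MPS device when available, falls back to CPU otherwise, and runs a trivial tensor op to confirm the device actually works:

```python
import torch

# Prefer the Apple-silicon GPU when the MPS backend is available;
# fall back to CPU otherwise.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# A trivial op on the chosen device to confirm it actually executes.
x = torch.ones(3, 3, device=device)
print(device, x.sum().item())  # sum of a 3x3 ones tensor is 9.0
```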
However, when I ran this sample notebook for testing, I kept getting a kernel-died error on the following line:
But then it got stuck at learn.fit_one_cycle(1): the training progress bar stayed at 0%, with the following warning messages:
/Users/alex/opt/miniconda3/envs/fastai/lib/python3.9/site-packages/torch/amp/autocast_mode.py:198: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
/Users/alex/opt/miniconda3/envs/fastai/lib/python3.9/site-packages/torch/cuda/amp/grad_scaler.py:115: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available. Disabling.
warnings.warn("torch.cuda.amp.GradScaler is enabled, but CUDA is not available. Disabling.")
[W ParallelNative.cpp:229] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
I tried running Jupyter with the --debug flag on, but it doesn't seem to give any info. I'll try a custom Python script outside Jupyter to catch any error messages.
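A standalone script along these lines (a sketch of my own, plain PyTorch rather than fastai) is useful for this kind of debugging: run outside Jupyter, an MPS failure surfaces as a normal Python traceback instead of a silently dead kernel.

```python
import torch
import torch.nn as nn

# Pick MPS when available, CPU otherwise.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# A tiny regression fit as a smoke test for the device.
model = nn.Linear(10, 1).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

x = torch.randn(64, 10, device=device)
y = torch.randn(64, 1, device=device)

for step in range(20):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

print(f"final loss on {device}: {loss.item():.4f}")
```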
I can’t explain why, but I couldn’t reproduce your error: I managed to run your script successfully.
FYI, I am using PyTorch 1.12.1 (M1 arm64 build) and fastai 2.7.10.
Last time I tried using the M1, not all PyTorch operations were supported yet for most of the more complicated libraries. Not sure if that's still the issue right now.
You still have that issue. I believe more operations are supported now, but some are still missing, and every once in a while you'll get a warning that a certain operation will fall back to the CPU, an error asking you to set an environment variable, or the code will simply crash. It's getting better, though …
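For reference, the environment variable PyTorch's error message points to for unimplemented MPS ops is PYTORCH_ENABLE_MPS_FALLBACK, which routes unsupported operations to the CPU. A minimal sketch of how it is typically set:

```python
import os

# Must be set BEFORE torch is imported, or it has no effect:
# it tells PyTorch to run ops missing from the MPS backend on the CPU.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

# import torch  # import torch only after the variable is set
```

Setting it in the shell (`export PYTORCH_ENABLE_MPS_FALLBACK=1`) before launching Jupyter achieves the same thing and avoids any ordering issues with imports.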
This is so strange
Based on your script, I made another script and ran it from the command line.
The script did work, and I could do the training (fit_one_cycle), though at a pretty slow speed.
But when I ran the lesson1 Jupyter notebook again, it either crashed at
Remove aug_transforms from the dataloaders (this is what is causing the issue).
You should be able to train on mps now
I haven’t found the exact reason the kernel dies with aug_transforms; I haven’t had a chance to look into it. But hopefully this unblocks people for now!
For me, on an M1 Mac, aug_transforms works with mps when limiting the transformations like this: aug_transforms(do_flip=False, flip_vert=False, max_rotate=0.0, max_zoom=1.0, max_warp=0.0)
Alternatively, calling datablock.dataloaders(path, device='cpu') works with arbitrary aug_transforms arguments.