What makes fastai unstable on an M1 Mac?

From the README, I saw that fastai doesn't natively support installation on M1 Macs.

But I learned that PyTorch now supports M1 Macs, so I attempted to install fastai on my M1 Mac. FYI, I use miniconda, and installed PyTorch and fastai separately via pip.

The installation was successful without any error messages, and I verified that my installed PyTorch has access to the M1 GPU, i.e.

In [5]: torch.backends.mps.is_available()
Out[5]: True
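
For completeness, a quick extra sanity check I'd suggest (the is_built() call and the small tensor allocation are just my own checks, nothing fastai requires):

import torch

print(torch.backends.mps.is_available())  # the MPS device can be used at runtime
print(torch.backends.mps.is_built())      # this PyTorch build was compiled with MPS support
x = torch.ones(2, device="mps")           # allocate a small tensor on the Apple GPU
print(x.device)                           # mps:0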

However, when I run this sample notebook for testing, I keep getting a kernel-died error on the following line:

dls = ImageDataLoaders.from_name_re(
    path, fnames, pat=r'(.+)_\d+.jpg$', item_tfms=Resize(460), bs=bs,
    batch_tfms=[*aug_transforms(size=224, min_scale=0.75), Normalize.from_stats(*imagenet_stats)])

The same error persisted despite reducing the batch size or image size, so I believe the issue is not related to memory but to something more fundamental.

Has anyone with experience using fastai on an M1 Mac encountered similar situations? I have no idea what causes the issue at the moment.

1 Like

Let me check

I found that commenting out this argument to ImageDataLoaders helps:

batch_tfms=[*aug_transforms(size=224, min_scale=0.75), Normalize.from_stats(*imagenet_stats)]

So I guess it has to do with the transformation pipeline.
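
For reference, this is roughly what the call looks like with that argument dropped (path, fnames and bs are assumed from the earlier notebook cells):

# Same call as before, but without batch_tfms, so no GPU-side augmentation or normalization is applied.
dls = ImageDataLoaders.from_name_re(
    path, fnames, pat=r'(.+)_\d+.jpg$', item_tfms=Resize(460), bs=bs)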

1 Like

But then it got stuck at learn.fit_one_cycle(1): the training progress bar sits at 0%, with the following warning messages:

/Users/alex/opt/miniconda3/envs/fastai/lib/python3.9/site-packages/torch/amp/autocast_mode.py:198: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
/Users/alex/opt/miniconda3/envs/fastai/lib/python3.9/site-packages/torch/cuda/amp/grad_scaler.py:115: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.
  warnings.warn("torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.")
[W ParallelNative.cpp:229] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
[W ParallelNative.cpp:229] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
[W ParallelNative.cpp:229] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
[W ParallelNative.cpp:229] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
[W ParallelNative.cpp:229] Warning: Cannot set number of intraop threads after para

In my opinion, the library should support mps.

A lot of notebooks I have seen online only check if CUDA is available though, so I had to add checks for mps support.
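
Something like this is the check I mean, as a sketch (pick_device is just my own helper name, not anything from fastai):

import torch

def pick_device():
    # Prefer Apple's Metal backend, then CUDA, then fall back to the CPU.
    if torch.backends.mps.is_available():
        return torch.device("mps")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

device = pick_device()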

2 Likes

Thanks @melonkernel, and good to see that the library takes the mps device into account.
I am still trying to figure out what went wrong with my setup.

Did you personally use fastai on an M1 Mac? (If so, did it work fine for you when running the course notebooks?)

FYI, I noticed others have encountered the same issue on their M1 Macs (see Beginner: Setup ✅ - #159 by knivets).

I have not run the fastai course notebooks, but I have used it for some other purposes.

My kernel also died

I tried running Jupyter with the --debug flag on, but it doesn't seem to give any info. I will try a custom Python script outside Jupyter to detect any error messages.

If I run this from the command line (with the same conda env I used for the notebook), it complains about:

-:27:11: error: invalid input tensor shapes, indices shape and updates shape must be equal
-:27:11: note: see current operation: %25 = "mps.scatter_along_axis"(%23, %arg0, %24, %1) {mode = 6 : i32} : (tensor<634800xf32>, tensor<460xf32>, tensor<211600xi32>, tensor) -> tensor<634800xf32>
/AppleInternal/Library/BuildRoots/a0876c02-1788-11ed-b9c4-96898e02b808/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphExecutable.mm:1267: failed assertion `Error: MLIR pass manager failed'
[1]

Autocast is not supported on M1, I think.
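
If your learner was created with mixed precision (e.g. a .to_fp16() call, as in some course notebooks), it may be worth switching it off before fitting on mps. A minimal sketch, assuming a learner called learn already exists:

# to_fp32() removes the MixedPrecision callback, so the CUDA-only autocast/GradScaler paths are not used.
learn = learn.to_fp32()
learn.fit_one_cycle(1)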

I can't explain why, but I couldn't reproduce your error: I managed to run your script successfully.
FYI, I am using PyTorch 1.12.1 (M1 arm64 build), and my fastai version is 2.7.10.

1 Like

Ok great. I had made some changes to my conda env. What Python version are you using, btw?

Python 3.9.13, arm64 build

Last time I tried using the M1, not all PyTorch operations were supported yet for most of the more complicated libraries. Not sure if that's still the issue right now.

You still have that issue. I believe more operations are supported now, but some are still missing, and every once in a while you'll get either a warning that a certain operation will switch over to the CPU, an error asking you to set an environment variable, or the code will simply crash. It's getting better though…
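
For what it's worth, the environment variable those errors point at is, I believe, PYTORCH_ENABLE_MPS_FALLBACK; it has to be set before torch is imported:

import os
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"  # let ops unsupported on MPS fall back to the CPU

import torch  # import torch only after the variable is set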

2 Likes

This is so strange.
Based on your script, I made another script and ran it from the command line.
The script did work and I could do the training (fit_one_cycle), though at a pretty slow speed.

But when I ran the lesson 1 Jupyter notebook again, it either crashed at

dls = ImageDataLoaders.from_name_re(
    path, fnames, pat=r'(.+)_\d+.jpg$', item_tfms=Resize(460), bs=bs,
    batch_tfms=[*aug_transforms(size=224, min_scale=0.75), Normalize.from_stats(*imagenet_stats)])

or crashed at

learn.fit_one_cycle(1)

I guess it is not related to unsupported PyTorch operations on the M1 Mac, but to something specific to the Jupyter notebook.
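
One difference I'd look at between the script and the notebook is the dataloader workers; forcing single-process loading with num_workers=0 is only a guess on my part, but it's cheap to try:

# num_workers=0 keeps all data loading in the notebook's main process.
dls = ImageDataLoaders.from_name_re(
    path, fnames, pat=r'(.+)_\d+.jpg$', item_tfms=Resize(460), bs=bs,
    num_workers=0)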

For those looking for solutions in 2023:

  1. Setup: mamba create -n fastai python=3.10 && mamba activate fastai && pip install fastai transformers timm jupyter
  2. Remove aug_transforms from dataloaders (this is causing the issue)
  3. You should be able to train on mps now

I haven't found the exact reason the kernel dies with aug_transforms; I haven't had a chance to look into it yet. But hopefully this unblocks people for now!
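
A minimal sketch of the kind of run I mean (resnet34, the pets data and the regex follow the lesson notebook; treat it as a sketch, not a verified benchmark):

from fastai.vision.all import *

path = untar_data(URLs.PETS)
fnames = get_image_files(path/'images')

# No aug_transforms here; only the item-level Resize, which runs on the CPU.
dls = ImageDataLoaders.from_name_re(
    path, fnames, pat=r'(.+)_\d+.jpg$', item_tfms=Resize(224), bs=32)

learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fit_one_cycle(1)  # fastai's default device handling should pick up mps, per the discussion above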

1 Like

For me, on an M1 Mac, aug_transforms works with mps when limiting the transformations like this:
aug_transforms(do_flip=False, flip_vert=False, max_rotate=0.0, max_zoom=1.0, max_warp=0.0)

Alternatively, calling datablock.dataloaders(path, device='cpu') works with arbitrary arguments to aug_transforms.
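
A rough sketch of that DataBlock variant, in case it helps (path and the pets-style labelling are assumptions from the lesson notebook; the relevant bit is only the device='cpu' argument):

from fastai.vision.all import *

dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),
    splitter=RandomSplitter(seed=42),
    item_tfms=Resize(460),
    batch_tfms=aug_transforms(size=224, min_scale=0.75))

# Keep the batch transforms on the CPU; the model itself can still run on mps.
dls = dblock.dataloaders(path/'images', device='cpu')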

1 Like

This worked on my M1 Mac too, i.e. passing those args to aug_transforms.