"Illegal Instruction (core dumped)" when running fastai V1 on AWS

harpalss · January 23, 2019, 1:14pm

I’ve got a p2.xlarge instance running on AWS with the Ubuntu Deep Learning image. I then performed the following steps to get fastai working:

Install a Python 3.7 environment (the default Python 3 version is 3.6; I used 3.7 because fastai v1 relies on dataclasses, which are built in from 3.7)
Check CUDA version with nvcc --version. On the instance it returns: Cuda compilation tools, release 9.0, V9.0.176
Install the corresponding PyTorch build as stated here; in my case it was: pip install torch torchvision
Install fastai v1 with pip install fastai
Install JupyterLab with pip install jupyterlab
Run Jupyter with: jupyter notebook --no-browser --port=8888
I then SSH to my Jupyter instance
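
The CUDA check in the steps above can also be scripted, which makes it easier to confirm the release before choosing a PyTorch build. This is a minimal sketch that assumes the nvcc output format quoted in the post; parse_cuda_release and installed_cuda_release are hypothetical helper names, not part of any library.

```python
import re
import subprocess


def parse_cuda_release(nvcc_output):
    """Return (major, minor) from `nvcc --version` text, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Run `nvcc --version` and parse its output; None if nvcc is not on PATH."""
    try:
        proc = subprocess.run(
            ["nvcc", "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except FileNotFoundError:
        return None
    return parse_cuda_release(proc.stdout)
```

For the line quoted above, parse_cuda_release("Cuda compilation tools, release 9.0, V9.0.176") gives (9, 0).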

I executed the following code in the console via jupyter lab:

As you can see, everything imports fine, but it hangs on the last command, create_cnn. The Jupyter output log states: Kernel restarted.

I then tried running it in a Python terminal and got:

I now get Illegal instruction (core dumped). I’m guessing the error occurs when loading onto the GPU? Any idea why this would happen?

I then tried to load the models manually:

The issue must be when it is loaded onto the GPU?
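
One way to narrow down where the crash happens is to run each step in its own interpreter and look at how that process exits. The following is a sketch under assumptions, not a definitive diagnosis: run_step, report and STEPS are hypothetical names, and resnet34 is only a stand-in because the model used in the post is not shown. It relies on the POSIX behaviour that subprocess reports a child killed by a signal as a negative return code, so an illegal instruction shows up as SIGILL.

```python
import signal
import subprocess
import sys

# Hypothetical step list: each snippet exercises one more piece than the last.
# resnet34 is a stand-in; substitute whatever model you were loading.
STEPS = [
    ("import torch", "import torch"),
    ("build model on CPU", "import torchvision; torchvision.models.resnet34()"),
    ("move tensor to GPU", "import torch; torch.zeros(1).cuda()"),
    ("move model to GPU", "import torchvision; torchvision.models.resnet34().cuda()"),
]


def run_step(code, timeout=600):
    """Run a Python snippet in a fresh interpreter and describe how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    rc = proc.returncode
    if rc == 0:
        return "ok"
    if rc < 0:
        # On POSIX a negative return code means the child died from that signal.
        try:
            return "killed by " + signal.Signals(-rc).name
        except ValueError:
            return "killed by signal %d" % -rc
    return "failed with exit code %d" % rc


def report(steps=STEPS):
    """Print one line per step so the first crashing step is easy to spot."""
    for label, code in steps:
        print("%-22s %s" % (label, run_step(code)))
```

Calling report() prints one line per step. The first line that reads killed by SIGILL is the step to look at; if that is a CPU-only step, the GPU is not involved.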

harpalss · January 30, 2019, 3:53pm

I’ve managed to get it working with a GPU on the Deep Learning Base AMI (Amazon Linux) Version 16.0.

Once the image was up and running on an EC2 instance and I had SSH’d in, I performed the following steps to get it working:

Check with nvcc --version that the image is using CUDA 9.0
pip3 install torch torchvision
pip3 install fastai

That’s it. I then installed jupyter and got a notebook up and running.