Azure FastAI Kernel not using GPU

(Hanman) #1

Following this tutorial, Using Azure FastAI, I created an instance and everything seems to be working, except that training was awfully slow. On further investigation, I realised that the Python (fastai) kernel was not using the GPU.

import torch
torch.cuda.is_available()

When I changed to another kernel, like Python 3.6 (AzureML), the above statement returned True.
Does anyone know how I can activate the GPU in the fastai kernel?

Thanks folks.


Are you using Linux or the Windows edition of the Data Science VM?


It looks like the latest version of PyTorch now also expects and installs a local cudatoolkit package, which defaults to CUDA 10. On the DSVM (K80), it looks like only CUDA 9 works. You can downgrade to cudatoolkit 9.0 by running this command:

conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
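
To see which CUDA version the installed PyTorch build targets, something like the following could be run in the fastai kernel before and after the downgrade. This is only a rough sketch: `cuda_report` is a hypothetical helper name, not part of fastai or PyTorch, and it assumes a PyTorch release recent enough to expose `torch.version.cuda`.

```python
# Sketch: report the CUDA version PyTorch was built against and whether
# a GPU is visible. `cuda_report` is a hypothetical helper name.
def cuda_report():
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this kernel's environment.
        return {"torch_installed": False}
    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": available,
        "device_name": torch.cuda.get_device_name(0) if available else None,
    }


report = cuda_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `built_for_cuda` shows a 10.x version while `cuda_available` is False, that would be consistent with the mismatch described above.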

(Hanman) #4

I used the Linux edition.


The solution above should work if you want to keep using an existing Data Science VM. I have also fixed the script so that newly created VMs will have the correct CUDA version for the DSVM.

(Hanman) #6

Thanks @zenlytix… I just created a new instance now, and it works…

(Tony) #7

This is how I did it on a Linux Ubuntu VM. It is for anyone who would rather fix an existing VM in place than create a new one with the fix, and who is unfamiliar with how to connect to the machine or with what happens behind the scenes with Jupyter notebooks.

  1. Open a terminal (cmd, PowerShell, Git Bash, etc.)
  2. ssh into the machine with the command
    ssh <your username>@<ip address>
    Use the same IP address and username that you use to connect to the Jupyter notebook.
  3. Enter the same password you use to connect to the Jupyter notebook
  4. Run the command:
    conda activate fastai
  5. Run the command provided by zenlytix:
    conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
  6. If you have a Jupyter notebook running, restart the kernel (from the menu bar at the top)
  7. Insert and run a cell at the top of your Jupyter notebook to verify that the installation worked
    import torch
    torch.cuda.is_available()
    This should return True, and you can continue with the notebook as usual.
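
As an extra check after the steps above, a tiny computation can be run on the GPU to confirm the kernel can actually use it. This is a minimal sketch, assuming PyTorch 0.4 or later; `gpu_smoke_test` is a hypothetical name used only for illustration.

```python
# Sketch: run a small matrix multiply on the GPU, if one is usable.
# `gpu_smoke_test` is a hypothetical helper, not a fastai or PyTorch API.
def gpu_smoke_test():
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "cuda not available"
    x = torch.ones(2, 2, device="cuda")  # allocate directly on the GPU
    total = (x @ x).sum().item()         # every entry of x @ x is 2, so the sum is 8
    return "ok" if total == 8.0 else "unexpected result"


status = gpu_smoke_test()
print(status)
```

If this prints "cuda not available" in the fastai kernel, the kernel probably has not been restarted since the install, or the environment still has the CUDA 10 build.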