Setup on Azure

Hi,
I was able to set up fastai on Azure. Here are the steps that I followed -

  1. Provision Deep Learning VM - https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/provision-deep-learning-dsvm
  2. Select OS - Linux / Ubuntu
  3. Select VM Type - NC6v2 - This has Nvidia Tesla P100 GPU (All the options are available @ https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes-gpu )
  4. Region can be “East US”, “South Central US” or “West US2”, since most VM types are available there
  5. SSH into VM once provisioned
  6. Fast.ai requires Python 3.6. Select Python 3.6 environment -
conda activate py36
  7. Install fastai -
conda install -c pytorch pytorch-nightly cuda92
conda install -c fastai torchvision-nightly
conda install -c fastai fastai
  8. Switch to the notebooks directory -
cd notebooks
  9. Clone the fastai course repo -
git clone https://github.com/fastai/course-v3
  10. Access Jupyter notebooks using “https://<VM_DNS_NAME>:8000/”
  11. After logging in to Jupyter, browse to /fastai/course-v3/nbs/dl1/lesson1-pets
  12. In the menu, select “Kernel -> Change Kernel -> Python 3.6 - AzureML”
  13. Done with setup. Time to practice with the notebooks.
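Once the steps above are done, a quick sanity check confirms the GPU is actually visible to PyTorch. This is a sketch assuming the py36 conda environment from step 6; on an NC-series VM with drivers installed, the final command should report that CUDA is available:

```shell
# Activate the Python 3.6 environment selected in step 6
conda activate py36

# Confirm the Tesla P100 is visible to the NVIDIA driver
nvidia-smi

# Confirm PyTorch can see the GPU
python -c "import torch; print(torch.cuda.is_available())"
```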

Feel free to share if you face any issues…


I’m impressed by how easy that all seems…

Does Azure have something like spot/preemptible instances?

Yes. It’s called “Low-priority VMs”.
This offering is currently restricted to services like “Azure Batch” for HPC workloads and “Azure Batch AI” for distributed DL training.


Does the Jupyter notebook start automatically? Not sure why it is configured at port 8000. When I start jupyter notebook from the command line, it starts at 8888.

For the Deep Learning VM, JupyterHub is preconfigured on port 8000, so yes, it starts automatically at OS startup. Your command likely started a second Jupyter instance on 8888.
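You can verify this yourself. A sketch, assuming JupyterHub runs as a systemd service on the DSVM (the service name here is an assumption and may differ by image version):

```shell
# List Jupyter servers started from the command line (the 8888 instance)
jupyter notebook list

# Check the preconfigured JupyterHub service (service name is an assumption)
sudo systemctl status jupyterhub

# Confirm which of the two ports are actually listening
sudo ss -tlnp | grep -E ':(8000|8888)'
```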


Just an FYI to anyone with free Azure credits: GPU instances are NOT eligible. I just tried to provision one and received a “Quota Exceeded” notification.

I think that might be different. You probably just need to request a higher quota.


This is just a safety measure on their part: GPU and other specialised VMs are higher cost and lower availability, so the default quota keeps accidental activations from strangling capacity for legitimate users. Once you use the support form and request a quota increase, mentioning it is for deep learning tasks, they should enable the additional quota within one working day. (At least this was my experience.)

Correct. @warrenwong, please raise a support ticket.
Here are the steps to request a quota increase for N-series VMs - https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-quota-errors#solution
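Before (or after) raising the ticket, you can check your current per-family vCPU quota from the Azure CLI. A sketch; the region and the JMESPath filter are illustrative, so adjust them to your subscription:

```shell
# Show current vCPU quota and usage for NC-series (GPU) families in a region
az vm list-usage --location eastus --output table \
  --query "[?contains(name.value, 'NC')]"
```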

Thanks all! Created a request.

Thanks a lot, solved my problem.

  • Keep an eye on the costs of the VM you create. For example the NC6 VM is less than half the price of the NC6s V2 and is probably satisfactory for most.
  • torchvision-nightly was not available on my VM from the existing channels. I changed to use torchvision, hoping all goes well.
  • Consider configuring the VM to Auto shutdown, which could save some money if you forget to turn it off.
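Auto-shutdown can be set in the portal, or from the Azure CLI. A minimal sketch using the `az vm auto-shutdown` command; the resource group and VM names below are placeholders:

```shell
# Schedule daily auto-shutdown at 22:00 UTC for the VM
# (resource group and VM names are placeholders - substitute your own)
az vm auto-shutdown \
  --resource-group my-fastai-rg \
  --name my-dsvm \
  --time 2200
```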

I tried to follow the instructions here https://course.fast.ai/start_azure.html but I get an error:

microsoft-dsvm:linux-data-science-vm-ubuntu:linuxdsvmubuntu:18.12.01 is not available

I will follow the steps listed here.

One question I have: will I be able to run Notebooks for ML course?

Many thanks!

Chintan

Did you start the VM? It could be that a newly created VM is in the stopped state and thus would be unavailable.

Thanks @Stuart. I could not start the VM as the deployment failed. I am currently following the instructions here https://medium.com/@manikantayadunanda/setting-up-deeplearning-machine-and-fast-ai-on-azure-a22eb6bd6429

My question is: will I be able to run ML1 notebooks?

You’ll need to clone the git repo containing the notebooks for that course. But after creating an Ubuntu VM, yes you should be able to run the notebooks.


Thanks @Stuart, I will revert back once I have done it.

Update @Stuart I have managed to set up Deep Learning VM and it is running.

I cloned the fastai repo for the ML course and created the fastai virtual environment.

I also downloaded bulldozer.zip to the ml1/data/ folder using the cURL method, but when I run unzip bulldozer.zip the file won’t unzip. The error is:

Archive:  bulldozer.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of bulldozer.zip or
        bulldozer.zip.zip, and cannot find bulldozer.zip.ZIP, period.

Any pointers for this error?
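That error usually means the downloaded file is not actually a zip archive - commonly an HTML error/login page or a truncated download saved under the .zip name. A sketch to diagnose it (the download URL below is a placeholder; Kaggle-hosted datasets typically need an authenticated session or the kaggle CLI):

```shell
# See what was actually downloaded - a failed download is often HTML, not zip
file bulldozer.zip
head -c 200 bulldozer.zip

# A valid zip archive starts with the bytes "PK". If yours does not,
# re-download with redirects followed (-L); URL is a placeholder
curl -L -o bulldozer.zip "https://example.com/path/to/bulldozer.zip"
unzip bulldozer.zip -d ml1/data/
```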