How to create an Azure VM for Data Science and fastai

I just posted a blog post using fast_template (I must switch over to fastpages!) that gives a step-by-step walkthrough of creating a GPU-based VM that you could use for the fastai course.
Let me know if any of it is not clear.
The one I show uses spot pricing, which means availability is not guaranteed and you might get shut down if a full-price customer needs the capacity (your data is persisted), but starting at less than $0.12 an hour this is certainly an affordable GPU option.

5 Likes

Maybe I am missing something, but I couldn’t see anywhere in the blog for specifying how many GPUs, and of which type, you want to attach to the VM. I presume it defaults to 1x K80?
I find Azure (and GCP) really difficult to work out if you want to provision, say, a 4x V100 machine for a few hours. With AWS it’s straightforward, but there are several stages of limits you need to get approval for, which takes days.

1 Like

Yes, @adrian, the NC6 is a specific configuration. If you want a different config then you could filter on GPU to see the more expensive/powerful options. My aim was really the minimum cost and config that would work for the course, but I agree you might want more power for projects you might start based on the course. I’d certainly welcome others chipping in with suggested ways of using the cloud (all vendors). I can see you might just want a non-GPU VM with some storage (or just storage?) and then spin up one or more GPU-based VMs to do the work when you need them… (Disclaimer - I work for Microsoft but Azure and AI/ML isn’t my day job, and I’m not a data scientist. I am an escalation engineer supporting Microsoft Project, Microsoft Planner and the cloud-based project/task management solutions, and a data hobbyist.)

2 Likes

Thanks Brian, if you know of a way to cancel your account with Azure please add that to your writeup. I created an account a few years back and I was required to give my credit card. After the trial period was over I tried canceling my account and there didn’t seem to be a way to get rid of my CC from their records. I hope they’ve changed their interface. I was trying to get it off and completely cancel it because at the time Azure policy was that if someone takes over your machine and runs up credit beyond your free credits you’re on the hook for it and MS has your CC.

I tried multiple ways, got the runaround from Azure support, and finally gave up; that credit card expired after a year or so. (This was circa 2014-2015.)

I would say to everyone using this service to be aware of any complications that may arise before setting up machines that may run up high bills if they leave them running or if they’re compromised. I haven’t touched an Azure machine since this happened and don’t intend to as I have an active AWS account and feel more comfortable using that.

YMMV.

1 Like

Yes, Mike - I’ll add a note to the blog, and I’m sorry for the bad experience you had. I’ll also add a note that you need to STOP your VM so that it shows as deallocated, and not just sign off or shut down, in order not to be charged for the time. @AzureSupport on Twitter is certainly one way to contact Azure support, or go through the various options at support.microsoft.com (call, mail, chat) and ask for Commerce. https://aka.ms/billingfaq also has resources to help understand and control costs.
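
For anyone who prefers the Cloud Shell to the portal, here is a minimal sketch of the same "stop until deallocated" step using the Azure CLI. The resource group and VM names in the usage comments are placeholders, not values from the blog, and the functions only wrap `az vm deallocate` and `az vm show`. Note that disks are still billed as storage while the VM is deallocated.

```shell
# Sketch: stop compute charges by deallocating the VM, then confirm its state.
# The resource group and VM names passed in are placeholders you supply.

# Deallocating releases the hardware; shutting down from inside the OS does not.
deallocate_vm() {
  az vm deallocate --resource-group "$1" --name "$2"
}

# Print the power state; the goal is "VM deallocated", not just "VM stopped".
vm_power_state() {
  az vm show --show-details --resource-group "$1" --name "$2" \
    --query powerState --output tsv
}

# Example usage (uncomment after substituting your own names):
# deallocate_vm "my-fastai-vm" "my-fastai-vm"
# vm_power_state "my-fastai-vm" "my-fastai-vm"
```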

1 Like

Thanks Brian! This was 5 years ago and I’m sure all those support kinks have been worked out, but I just wanted to put it out there. (I did go through all channels and even spoke to someone on the phone. There were a lot of loops in the process: I kept getting redirected back to the FAQ/Admin page, from there to the help line 800 number, and from there back to the FAQ/Admin page… At some point I didn’t think the risk was high enough to justify spending any more time, as my machine was shut down and my CC was about to expire.) :slight_smile:

1 Like

@adrian you could choose “NC24s v3” instead of “NC6” for:
24 x Intel Xeon CPU
448 GiB RAM
4 x NVIDIA V100 GPUs
But, of course, it’s very expensive :wink:

For this course NV6 (spot price saving ~86%) or NV6_Promo (Pay as you go) should be a good starting point :wink:

1 Like

Great post!

1 Like

Thanks, I hadn’t looked hard enough at the pricing page https://azure.microsoft.com/en-au/pricing/details/virtual-machines/windows/
I have found that it pays to be prepared before you spin up an expensive VM (e.g. in the last days of a competition, when you need to train models rapidly). If you have pre-generated training data locally, upload it to a cloud storage bucket with the same provider as the VM, so that the VM can pull the data over a fast connection as soon as you spin it up (or attach a drive with the data pre-loaded).
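
On Azure, one way to do this might look like the sketch below, which stages a local folder in Blob Storage and later pulls it down on the VM. It assumes a storage account and container already exist and that you are already authenticated; the account, container and folder names in the usage comments are hypothetical.

```shell
# Sketch: stage training data in Azure Blob Storage ahead of time.
# Assumes the storage account and container already exist and that the CLI
# is already authenticated. All names passed in are placeholders.

# Run locally: upload every file under a local folder to the container.
upload_training_data() {
  local account="$1" container="$2" source_dir="$3"
  az storage blob upload-batch --account-name "$account" \
    --destination "$container" --source "$source_dir"
}

# Run on the VM once it is up: download the container into a local folder.
download_training_data() {
  local account="$1" container="$2" target_dir="$3"
  mkdir -p "$target_dir"
  az storage blob download-batch --account-name "$account" \
    --source "$container" --destination "$target_dir"
}

# Example usage (uncomment after substituting your own names):
# upload_training_data "mystorageacct" "training-data" "./data"
# download_training_data "mystorageacct" "training-data" "$HOME/data"
```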

1 Like

I installed fastai v2 on Azure following the very useful blog of @brismith. I created an NC6 VM using Ubuntu 18.04. It works well. I would like to know where in the VM untar_data saves the downloaded files and how to access them. Logging into the VM by ssh, I cannot find them.

I found it easier to just tweak the file from github and use the Azure Command Line Interface.

  1. Just paste this to the Cloud Shell:

wget https://gist.githubusercontent.com/mu-ct/f7ef26de6a5372704cfce12f6fa54f0b/raw/0be27de453cad76fb8a3b176eae6d5ece098ad47/fastazure.sh
bash fastazure.sh

It creates a DSVM on Ubuntu 18.04 in Standard_NC6_Promo size (by default in North Europe but you can change that in the Cloud Shell) and downloads all the fast.ai files in a Jupyter Notebook.

  2. In your browser (Brave didn’t work for me) go to https://(public IP of your VM):8000
  3. Your browser will not trust this site, but click “visit the site” or something similar and JupyterHub will appear
  4. Login is “fastuser”, password is the one you set in the Cloud Shell
  5. You are done, but make sure that the VM is Standard_NC6_Promo and not just Standard_NC6. In the main page of the Azure portal go to “Virtual machines” and select the one you just created. If the size is not Standard_NC6_Promo, go to “Size” on the left and change it. Remember to stop your machine in the portal after you are done using it!!
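
If you would rather stay in the Cloud Shell for the last checks, the sketch below looks up the public IP, builds the JupyterHub address and prints the current VM size. It assumes the resource group has the same name as the VM (the script passes “-g $vmname”); the name in the usage comments is a placeholder.

```shell
# Sketch: find the JupyterHub address and check the VM size from the Cloud Shell.
# Arguments are the resource group and the VM name (placeholders you supply).

# Print the VM's public IP address.
vm_public_ip() {
  az vm show --show-details --resource-group "$1" --name "$2" \
    --query publicIps --output tsv
}

# Build the JupyterHub address: https:// in front, port 8000 at the end.
jupyterhub_url() {
  printf 'https://%s:8000\n' "$1"
}

# Print the current size, e.g. Standard_NC6_Promo.
vm_size() {
  az vm show --resource-group "$1" --name "$2" \
    --query hardwareProfile.vmSize --output tsv
}

# Example usage (uncomment after substituting your own VM name):
# jupyterhub_url "$(vm_public_ip "my-fastai-vm" "my-fastai-vm")"
# vm_size "my-fastai-vm" "my-fastai-vm"
# To change the size from the CLI instead of the portal:
# az vm resize --resource-group "my-fastai-vm" --name "my-fastai-vm" --size Standard_NC6_Promo
```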

If you want to change the size, type or anything else about the VM you want to create, have a look here: docs.microsoft.com/en-US/cli/azure/vm?view=azure-cli-latest#az_vm_create

LOCATION
When you run the script in Cloud Shell, it will ask you for the location. Choose a place close to you that is also cheap; look here (azure.microsoft.com/en-gb/pricing/details/virtual-machines/linux/) for pricing. The location names are the same as the “Regions” on that page, just lower case and without spaces, for example “northeurope” or “westus2”.
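
As a small illustration of that naming rule, here is a sketch of a helper (the function name is made up for this example) that turns a region display name into the location string the script expects:

```shell
# Sketch: convert a region display name from the pricing page into the
# location name used by the CLI: lower case, with the spaces removed.
# region_to_location is a hypothetical helper name for this example.
region_to_location() {
  printf '%s\n' "$1" | tr -d ' ' | tr '[:upper:]' '[:lower:]'
}

region_to_location "North Europe"   # prints: northeurope
region_to_location "West US 2"      # prints: westus2

# To see the names your subscription can actually use:
# az account list-locations --output table
```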

TYPE

  1. in the Azure portal go to “Create a Resource”, find the VM that interests you in the Marketplace, click on “Usage Information + Support”
  2. edit this file: https://github.com/Azure/DataScienceVM/blob/main/Samples/fastai2/fastai2onAzureSpotDSVM.sh
  3. the line you are looking for is “az vm create…”
  4. take the IDs from step 1 and put them in the “az vm create” line like this:

az vm create --name $vmname -g $vmname --publisher (Publisher ID) --offer (Offer ID) --version latest --priority Regular --size Standard_NC6_Promo --storage-sku StandardSSD_LRS --admin-user fastuser --admin-password $password

  5. for a spot instance change “--priority Regular” to “--priority Low” and add “--eviction-policy Deallocate” or “--eviction-policy Delete”
  6. save it, click “Raw”, copy the address and pass it in the Cloud Shell like this:

wget (address of your github file)
bash (name of your file)
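
To make the spot versus regular choice in the steps above a bit more concrete, here is a sketch of a helper that prints the flags to put in the “az vm create” line. The mode labels (“spot”, “regular”) and the function name are made up for this example; the flag values are the ones listed above. With Deallocate the evicted VM is stopped and its disks are kept, while Delete removes the VM and its disks.

```shell
# Sketch: print the pricing-related flags for the "az vm create" line.
# The mode labels "spot" and "regular" are hypothetical names for this example.
priority_flags() {
  local mode="$1" eviction="${2:-Deallocate}"
  case "$mode" in
    spot)    printf '%s\n' "--priority Low --eviction-policy $eviction" ;;
    regular) printf '%s\n' "--priority Regular" ;;
    *)       echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

priority_flags spot          # prints: --priority Low --eviction-policy Deallocate
priority_flags spot Delete   # prints: --priority Low --eviction-policy Delete
priority_flags regular       # prints: --priority Regular
```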

SIZE

  • if you don’t know what sizes are available in your location write in the Cloud Shell:

az vm list-sizes -l (your location, for example “westus2”) -o table

  • in the same file as above, in the “az vm create” line change the “--size” to whatever you want, for example “--size Standard_D2_v2_Promo”, and follow the remaining steps from the TYPE section (save, copy the raw address, run it in the Cloud Shell)
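
The full size list is long, so a sketch like the one below can narrow it down to one family. It only pipes the `az vm list-sizes` table through `grep`; the function name and the example pattern are illustrative.

```shell
# Sketch: list only the VM sizes in a location whose name matches a pattern.
# list_sizes_matching is a hypothetical helper name for this example.
list_sizes_matching() {
  local location="$1" pattern="$2"
  az vm list-sizes --location "$location" --output table | grep -i -- "$pattern"
}

# Example usage (uncomment to run against your subscription):
# list_sizes_matching "westus2" "NC"
```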

If this doesn’t work for you:

  • make sure your subscription has room for the vCPUs you are creating. Go to “Subscriptions”, choose yours, then on the left choose “Usage + quotas”. If you don’t have enough of the “Total Regional vCPUs” or “Standard NC Promo Family vCPUs” etc., you can upgrade your subscription or choose a smaller VM size
  • make sure the VM you are trying to create is available in your location
  • make sure that in your VM “Activity log” the “Create or Update Virtual Machine Extension” was successful
  • remember to put “https://” before the IP and “:8000” after it
  • try using a different browser (Brave didn’t work on my computer)
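
The vCPU quota check from the first bullet can also be done from the Cloud Shell. This is a rough sketch built on `az vm list-usage`; the function name and the default filter are assumptions for illustration, and the table shows the current value next to the limit for each quota.

```shell
# Sketch: show vCPU quota usage for a location (current value and limit).
# vcpu_usage is a hypothetical helper name; the second argument is an optional
# filter, for example "NC Promo" to see just that family.
vcpu_usage() {
  local location="$1" pattern="${2:-vCPUs}"
  az vm list-usage --location "$location" --output table | grep -i -- "$pattern"
}

# Example usage (uncomment to run against your subscription):
# vcpu_usage "northeurope"
# vcpu_usage "northeurope" "NC Promo"
```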

Hope this saves you a bit of time. Enjoy