Notes on Part 0 - Google Cloud Platform

CharlesMerriam · May 28, 2020, 4:28am

It took me a few frustrating hours to get going, so let me share the little roadblocks I ran into (as of May, 2020):

Install gcloud tools.
- These tools require Python 2. Remember that? It’s a decade or more old. The installer claims to work with Python 3 as well; in my experience it doesn’t.
- On OSX, brew no longer installs Python 2 correctly. You know there is a problem if you get errors about ‘md5’ missing.
- I worked around this by creating a conda environment with Python 2 just to install gcloud. That is, conda create -n py2 python=2.7; source activate py2
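
A minimal sketch of that workaround as shell helpers, in case it helps. The function names `python_has_md5` and `use_python_for_gcloud` are made up for illustration. `CLOUDSDK_PYTHON` is the environment variable gcloud reads to choose its interpreter. The md5 check assumes the broken brew install fails on `hashlib.md5()`, which matches the ‘md5’ errors mentioned above.

```shell
# Hypothetical helper: succeeds only if the interpreter given as $1
# can build an md5 hash (the check a broken Python 2 install fails).
python_has_md5() {
    "$1" -c 'import hashlib; hashlib.md5()' >/dev/null 2>&1
}

# Hypothetical helper: point gcloud at the interpreter given as $1,
# but only if it passes the md5 check.
use_python_for_gcloud() {
    if python_has_md5 "$1"; then
        export CLOUDSDK_PYTHON="$1"
        echo "gcloud will use $1"
    else
        echo "skipping $1: missing, or hashlib.md5 is broken" >&2
        return 1
    fi
}

# Usage, after: conda create -n py2 python=2.7; source activate py2
#   use_python_for_gcloud "$(command -v python)"
```

Setting the variable only after the check passes means a bad interpreter never ends up in your environment by accident.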
GPUS_ALL_REGIONS limit 0.0 when creating instance.
- You actually need to ask Google if they will permit you to pay them money. This is a full customer service request.
- You will need to ‘upgrade your account’ when asked; I have no idea what that did.
- Go to the ‘Quotas’ page, https://console.cloud.google.com/iam-admin/quotas. It’s about 1,400 options without a search bar. Click on the “Limit” column header to sort by ascending limits. Filter to Global locations: click Locations, None, Global. You should find GPUS_ALL_REGIONS.
- Click on the checkbox next to GPUS_ALL_REGIONS, then the “Edit Quotas” button near the top. This brings up a sidebar. Type ‘fastai’ in the request reason. Submit.
- Wait.
- Wait.
- You should get an email (not a console notification) that your request is approved.
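
If you would rather check the limit from the command line than wait for the email, something like the sketch below may work. The helper name `gpu_quota_limit` is made up. It assumes the quota shows up in gcloud output under the metric name GPUS_ALL_REGIONS, and that the `--flatten` and `--format` expressions in the usage comment produce one "metric, limit" pair per line; adjust if your output differs.

```shell
# Hypothetical helper: read "METRIC LIMIT" pairs on stdin and print the
# limit for GPUS_ALL_REGIONS. Exits non-zero if the metric is absent.
gpu_quota_limit() {
    awk '$1 == "GPUS_ALL_REGIONS" { print $2; found = 1 } END { exit !found }'
}

# Assumed usage (needs gcloud installed and a default project configured):
#   gcloud compute project-info describe \
#       --flatten="quotas[]" \
#       --format="value(quotas.metric,quotas.limit)" | gpu_quota_limit
```

A printed limit of 0.0 would mean the request has not been approved yet.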
Working with gcloud.
- Sometimes, ssh gets confused. I found removing my keys helped: rm ~/.ssh/google_*
- SSH fails with the bland 4033: u'not authorized' for many reasons, including your instance not running.
- I use shortcuts for common gcloud commands, with g as my hint command. Below is the section from my .bash_profile.

    # Google Cloud Platform aliases
    export G_INSTANCE_NAME="fastai-1"
    export G_PROJECT_TAG="neural-aquifer-29999"
    export G_BASE="$HOME/p/fastai"   # ~ does not expand inside quotes, so use $HOME
    export G_IMAGE_FAMILY="pytorch-latest-gpu"
    export G_ZONE="us-west1-b"
    export G_INSTANCE_TYPE="n1-highmem-8"
    alias gg="cd $G_BASE; source google_cli_profile"
    alias gdown="gcloud compute instances stop $G_INSTANCE_NAME"
    alias gup="gcloud compute instances start $G_INSTANCE_NAME"
    alias ginst="gcloud compute instances list"
    # URLs are single-quoted so the shell does not treat & or ? specially
    alias gcon="open 'https://console.cloud.google.com/compute/instances?project=$G_PROJECT_TAG&instancessize=50'"
    alias gai1="open 'https://course.fast.ai/videos/?lesson=1'"
    alias gjup="open http://localhost:8080"
    alias gssh="gcloud compute ssh --zone=$G_ZONE jupyter@$G_INSTANCE_NAME -- -L 8080:localhost:8080"
    alias g="echo \"
        Variables G_INSTANCE_NAME, G_PROJECT_TAG, G_BASE, G_IMAGE_FAMILY, G_ZONE, G_INSTANCE_TYPE
        gg = go to directory and set gcloud path
        gdown = turn instance off
        gup = turn instance on
        ginst = list GCP instances
        gcon = open GCP console
        gai1 = open FastAI lesson 1
        gssh = ssh to the image
        gjup = open jupyter in localhost

        gcloud compute instances create $G_INSTANCE_NAME
            --zone=$G_ZONE
            --image-family=$G_IMAGE_FAMILY
            --image-project=deeplearning-platform-release
            --maintenance-policy=TERMINATE
            --accelerator='type=nvidia-tesla-p100,count=1'
            --machine-type=$G_INSTANCE_TYPE
            --boot-disk-size=200GB
            --metadata='install-nvidia-driver=True'
            # add --preemptible for a cheaper instance that Google stops after at most 24 hours
        \""

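The hint text above only prints the create command, and the ssh alias does not check whether the instance is up. As a rough sketch, both could be written as shell functions instead. The names `gcreate`, `ginstance_status`, and `gssh_checked` are made up for illustration. The flags are the ones from the hint text, and the defaults mirror the exported variables. The status check assumes `gcloud compute instances describe` with `--format="value(status)"` prints a single word such as RUNNING or TERMINATED.

```shell
# Hypothetical: runnable version of the create command from the hint text.
# Extra flags (for example --preemptible) are passed through via "$@".
gcreate() {
    gcloud compute instances create "${G_INSTANCE_NAME:-fastai-1}" \
        --zone="${G_ZONE:-us-west1-b}" \
        --image-family="${G_IMAGE_FAMILY:-pytorch-latest-gpu}" \
        --image-project=deeplearning-platform-release \
        --maintenance-policy=TERMINATE \
        --accelerator=type=nvidia-tesla-p100,count=1 \
        --machine-type="${G_INSTANCE_TYPE:-n1-highmem-8}" \
        --boot-disk-size=200GB \
        --metadata=install-nvidia-driver=True \
        "$@"
}

# Hypothetical: print the instance status as reported by gcloud.
ginstance_status() {
    gcloud compute instances describe "${G_INSTANCE_NAME:-fastai-1}" \
        --zone="${G_ZONE:-us-west1-b}" \
        --format="value(status)"
}

# Hypothetical: only attempt ssh when the instance reports RUNNING, so a
# stopped instance gives a clear message instead of 'not authorized'.
gssh_checked() {
    status=$(ginstance_status)
    if [ "$status" = "RUNNING" ]; then
        gcloud compute ssh --zone="${G_ZONE:-us-west1-b}" \
            "jupyter@${G_INSTANCE_NAME:-fastai-1}" -- -L 8080:localhost:8080
    else
        echo "instance is ${status:-unknown}; start it first" >&2
        return 1
    fi
}
```

Unlike aliases, functions read the G_ variables when they run, so changing G_INSTANCE_NAME in a live shell takes effect without re-sourcing the profile.
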
Good luck! Ask if you have problems.