It took me a few frustrating hours to get going, so let me share the little roadblocks I ran into (as of May, 2020):
- Install gcloud tools.
  - These tools require a version of Python 2. Remember that? It’s a decade or more old. The installer claims to support Python 3 as well: it doesn’t.
  - On OSX, brew no longer installs Python 2 correctly. You know there is a problem if you get errors about ‘md5’ missing.
  - I worked around it by creating a conda environment with Python 2 just to install gcloud. That is, conda create -n py2 python=2.7; source activate py2
- GPU_ALL_REGIONS limit 0.0 when creating instance
  - You actually need to ask Google if they will permit you to pay them money. This is a full customer service request.
  - You will need to ‘upgrade your account’ when asked; I have no idea what that did.
  - Go to the ‘Quotas’ page, https://console.cloud.google.com/iam-admin/quotas. It has about 1,400 options and no search bar. Click on the “Limit” column header to sort by ascending limits. Filter to Global locations: click Locations, None, Global. You should find GPU_ALL_REGIONS.
  - Click on the checkbox next to GPU_ALL_REGIONS, and the “Edit Quotas” button near the top. This brings up a sidebar. Type ‘fastai’ in the request reason. Submit.
  - Wait
  - Wait
  - You should get an email (not console notification) that your request is approved.
- Working with gcloud.
  - Sometimes, ssh gets confused. I found removing my keys helped:
    rm ~/.ssh/google_*
  - SSH fails with the bland 4033: u'not authorized' for many reasons, including your instance not running.
  - I use shortcuts for common gcloud commands, with g as my hint command. Below is the section from my .bash_profile.
# Google Cloud Platform aliases
export G_INSTANCE_NAME="fastai-1"
export G_PROJECT_TAG="neural-aquifer-29999"
export G_BASE="$HOME/p/fastai"
export G_IMAGE_FAMILY="pytorch-latest-gpu"
export G_ZONE="us-west1-b"
export G_INSTANCE_TYPE="n1-highmem-8"
alias gg="cd $G_BASE; source google_cli_profile"
alias gdown="gcloud compute instances stop $G_INSTANCE_NAME"
alias gup="gcloud compute instances start $G_INSTANCE_NAME"
alias ginst="gcloud compute instances list"
alias gcon="open 'https://console.cloud.google.com/compute/instances?project=$G_PROJECT_TAG&instancessize=50'"
alias gai1="open 'https://course.fast.ai/videos/?lesson=1'"
alias gjup="open http://localhost:8080"
alias gssh="gcloud compute ssh --zone=$G_ZONE jupyter@$G_INSTANCE_NAME -- -L 8080:localhost:8080"
alias g="echo \"
Variables G_INSTANCE_NAME, G_PROJECT_TAG, G_BASE, G_IMAGE_FAMILY, G_ZONE, G_INSTANCE_TYPE
gg = go to directory and set gcloud path
gdown = turn instance off
gup = turn instance on
ginst = list GCP instances
gcon = open GCP console
gai1 = open FastAI lesson 1
gssh = ssh to the image
gjup = open jupyter in localhost
gcloud compute instances create $G_INSTANCE_NAME
--zone=$G_ZONE
--image-family=$G_IMAGE_FAMILY
--image-project=deeplearning-platform-release
--maintenance-policy=TERMINATE
--accelerator=quote type=nvidia-tesla-p100,count=1 quote
--machine-type=$G_INSTANCE_TYPE
--boot-disk-size=200GB
--metadata=quote install-nvidia-driver=True quote
# add --preemptible for cheaper and 24hr delete
\""
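Since the ‘not authorized’ error can simply mean the instance is stopped, a status check before ssh may save some head-scratching. Here is a minimal sketch, assuming the G_INSTANCE_NAME and G_ZONE exports from the profile above. The function names gstatus and gssh_checked are made up for illustration and are not part of my original profile.

```shell
# Hypothetical helpers (illustration only); they reuse the
# G_INSTANCE_NAME and G_ZONE variables exported in the profile.

# Print the instance status as reported by gcloud, e.g. RUNNING or TERMINATED.
gstatus() {
  gcloud compute instances describe "$G_INSTANCE_NAME" \
    --zone="$G_ZONE" --format='value(status)'
}

# Attempt ssh (with the Jupyter port forward) only when the instance is RUNNING.
gssh_checked() {
  local state
  state="$(gstatus)" || {
    echo "could not read the status of $G_INSTANCE_NAME" >&2
    return 1
  }
  if [ "$state" != "RUNNING" ]; then
    echo "$G_INSTANCE_NAME is $state; start it first (gup)" >&2
    return 1
  fi
  gcloud compute ssh --zone="$G_ZONE" "jupyter@$G_INSTANCE_NAME" \
    -- -L 8080:localhost:8080
}
```

With these in the profile, gssh_checked behaves like gssh when the instance is up, and otherwise prints the current status instead of the bland ssh error.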
Good luck! Ask if you have problems.