Platform: Google Cloud Platform (GCP)

jeremy · March 17, 2020, 2:34pm

(NB: this is incomplete. It’s a wiki post - please contribute!)

Basic steps

Step 1: Creating your account
When running gcloud init, you may get the following warning:
WARNING: Listing available projects failed: There was a problem refreshing your current auth tokens: invalid_grant: Bad Request
In this case:
- interrupt gcloud init by pressing Ctrl+C
- run gcloud auth login: a browser session will open
- run gcloud init again: the warning should no longer appear
Step 2: Install Google CLI

As explained here, the first command of Step 3 must be modified as follows:

 export IMAGE_FAMILY="pytorch-latest-gpu" 
 export ZONE="us-west1-b"
 export INSTANCE_NAME="my-fastai-instance"
 export INSTANCE_TYPE="n1-highmem-16" # N2D machines appear to be in beta, are not available in all zones, and no longer work with the P100.

 gcloud compute instances create $INSTANCE_NAME \
     --zone=$ZONE \
     --image-family=$IMAGE_FAMILY \
     --image-project=deeplearning-platform-release \
     --maintenance-policy=TERMINATE \
     --accelerator="type=nvidia-tesla-p100,count=1" \
     --machine-type=$INSTANCE_TYPE \
     --boot-disk-size=200GB \
     --metadata="install-nvidia-driver=True"

Follow the rest of Step 3 as is, but note that since the instance is no longer preemptible, it doesn’t stop automatically after 24 hours. You must therefore stop it whenever you’re not using it (e.g., when you’re not training a network). See Step 5 to learn how to stop an instance.

As explained here, Step 4 must be substituted by the following commands:
- git clone https://github.com/fastai/course-v4
- git clone https://github.com/fastai/fastbook.git
  Note: Running git pull inside the course-v4 or fastbook folder keeps it up to date.
- Install fastai2 with pip install fastai2, then install its dependencies by running pip install -r requirements.txt from inside the course-v4 folder.
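
The Step 4 commands above can be sketched as one shell function to run on the instance. This is only a sketch: the function name setup_fastai and its optional target-directory argument are hypothetical, and it assumes git and pip are already on the PATH of the image.

```shell
# Hypothetical helper: clone (or update) both repos and install fastai2.
# $1 is an assumed, optional target directory; it defaults to $HOME.
setup_fastai() {
    local workdir="${1:-$HOME}"
    local repo
    for repo in course-v4 fastbook; do
        if [ -d "$workdir/$repo/.git" ]; then
            # Repo already cloned: bring it up to date instead
            git -C "$workdir/$repo" pull || return 1
        else
            git clone "https://github.com/fastai/$repo" "$workdir/$repo" || return 1
        fi
    done
    pip install fastai2 || return 1
    # Dependencies listed by the course repo
    pip install -r "$workdir/course-v4/requirements.txt"
}
```

Calling setup_fastai with no argument clones into the home directory; calling it again later only pulls the latest changes and reinstalls the requirements.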
Step 5: Stop an instance

FAQ

gcloud - CLI commands

# start instance
gcloud compute instances start <instance-name>

# stop instance
gcloud compute instances stop <instance-name>

# check status
gcloud compute instances describe <instance-name> | grep status
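
The grep in the status check can also match other lines that happen to contain "status". A minimal sketch of a stricter check follows, using gcloud's --format flag to print only the status field. The helper names instance_status and stop_if_running are hypothetical, and the default zone is just the us-west1-b used in the examples above.

```shell
# Print only the status field (e.g. RUNNING or TERMINATED) for one instance.
# $2 is an optional zone; the default is an assumption taken from this post.
instance_status() {
    local name="$1" zone="${2:-us-west1-b}"
    gcloud compute instances describe "$name" --zone="$zone" --format='value(status)'
}

# Stop the instance only if it is currently running.
stop_if_running() {
    local name="$1" zone="${2:-us-west1-b}"
    if [ "$(instance_status "$name" "$zone")" = "RUNNING" ]; then
        gcloud compute instances stop "$name" --zone="$zone"
    else
        echo "$name is not running"
    fi
}
```

For example, stop_if_running my-fastai-instance could be run at the end of every session as a safety net.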

sylvaint · March 17, 2020, 4:07pm

Please note that the initial $300 credit can NOT be applied towards a GPU…
Also, they refused to upgrade my account with a GPU because I did not have enough of a billing history.
I ended up setting up an AWS instance in 15 minutes.

miwojc · March 17, 2020, 4:25pm

Is this new at GCP? It used to be possible to use the $300 credit on GPU instances.

jeremy · March 17, 2020, 4:26pm

Lots of folks have done that successfully - might be something specific to your account.

sylvaint · March 17, 2020, 4:34pm

Maybe specific to my account, not sure why that would be.

https://cloud.google.com/free/docs/gcp-free-tier#limitations
“You can’t add GPUs to your VM instances.”

Here is their response:

Hello,

We have received your quota request for fastaiserver. Unfortunately, we are unable to grant your quota increase due to insufficient service usage history.

=== Quota Requested ===
+----------------+------------------+
| Request        | GPUS_ALL_REGIONS |
+----------------+------------------+
| Region: GLOBAL | 1                |
+----------------+------------------+

Our suggestion would be for you to make usage of your current quota and if further resources are needed then please go ahead and contact[1] our Sales Team in order to discuss further options on higher quota eligibility.

Regarding your concerns, for now since this is a new project you are not eligible for increase yet so you have to wait for 48 hours until your Billing account has additional history. You cannot use the free 300$ credit for GPUs, it is only limited for some resources. You may visit: https://cloud.google.com/free/docs/gcp-free-tier#limitations to get more information on where you can use the $300 free credit.

Thank you for your patience and understanding.

Best Regards,

King on behalf of the Google Compute Team

jeremy · March 17, 2020, 4:38pm

Ah OK @sylvaint I think I see the issue. You need to upgrade your account - which means you need to provide a credit card. At that point, you still have the $300 credits, and should be able to use them for GPUs (if you’re able to get a quota request approved).

pinaki · March 17, 2020, 4:42pm

Hi @jeremy will AWS / Sagemaker still work for this year’s course ? I was following the instructions here https://course.fast.ai/start_sagemaker.html

jeremy · March 17, 2020, 4:44pm

@pinaki should be fine - just pip install fastai2.

sylvaint · March 17, 2020, 4:45pm

The account is upgraded with a valid credit card. Maybe this issue is specific to my account.
However, they specifically stated that the $300 credit could not be used towards a GPU.

jeremy · March 17, 2020, 4:48pm

I’m fairly sure that’s wrong.

miwojc · March 17, 2020, 4:49pm

from GCP site:

so indeed free tier needs to be upgraded to paid account in order to use GPU!

You must upgrade to a paid account to use Google Cloud after the free trial ends. To take advantage of the features of a paid account (using GPUs, for example), you can upgrade before the trial ends. When you upgrade, the following conditions apply:

Any remaining, unexpired free trial credit remains in your account.
Your credit card on file is charged for resources you use in excess of what’s covered by any remaining credit.

sylvaint · March 17, 2020, 4:58pm

Let me start over with a new account and I will report back.

vijaysai · March 17, 2020, 5:59pm

My experience is that GPUs are allowed on the free tier (with a credit card added). However, when you build a preemptible instance, the VM either does not start at all, or it starts and has already stopped by the time you SSH into it. This was not my experience last year. I’ve not tried a non-preemptible instance yet. I’ll update once I try it.

It seems preemptible instances are not allowed on the free tier. Thanks to @Harvey for the info. Leaving a link from another thread.

sylvaint · March 17, 2020, 6:20pm

Yes @vijaysai, I was just about to report the exact same thing!
I will try a non-preemptible instance.

sylvaint · March 17, 2020, 6:32pm

It works with a non-preemptible instance.
Just comment out --preemptible from the create instance command:

export IMAGE_FAMILY="pytorch-latest-gpu" # or "pytorch-latest-cpu" for non-GPU instances
export ZONE="us-west1-b"
export INSTANCE_NAME="my-fastai-instance"
export INSTANCE_TYPE="n1-highmem-8" # budget: "n1-highmem-4"

# budget: 'type=nvidia-tesla-k80,count=1'
gcloud compute instances create $INSTANCE_NAME \
        --zone=$ZONE \
        --image-family=$IMAGE_FAMILY \
        --image-project=deeplearning-platform-release \
        --maintenance-policy=TERMINATE \
        --accelerator="type=nvidia-tesla-p100,count=1" \
        --machine-type=$INSTANCE_TYPE \
        --boot-disk-size=200GB \
        --metadata="install-nvidia-driver=True"
#        --preemptible

sylvaint · March 17, 2020, 7:41pm

Make sure you STOP your instance, especially now that it seems only non-preemptible instances work. You are charged for as long as the instance is running.
I find using the Google CLI easier and safer than the GCP web console.
You can set up aliases in bash.

gcloud compute instances list

NAME    ZONE        MACHINE_TYPE  PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP  STATUS
fastai  us-west1-b  n1-highmem-8               1.1.1.1                   TERMINATED

gcloud compute instances start fastai

No zone specified. Using zone [us-west1-b] for instance: [fastai].
Starting instance(s) fastai...done.
Updated [https://compute.googleapis.com/compute/v1/projects/eloquent-hold-271417/zones/us-west1-b/instances/fastai].
Instance internal IP is xxx
Instance external IP is xxx

gcloud compute instances list
NAME    ZONE        MACHINE_TYPE  PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP  STATUS
fastai  us-west1-b  n1-highmem-8               ###          ###          RUNNING

gcloud compute instances stop fastai

No zone specified. Using zone [us-west1-b] for instance: [fastai].
Stopping instance(s) fastai...done.
Updated [https://compute.googleapis.com/compute/v1/projects/eloquent-hold-271417/zones/us-west1-b/instances/fastai].

gcloud compute instances list
NAME    ZONE        MACHINE_TYPE  PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP  STATUS
fastai  us-west1-b  n1-highmem-8               ###                       TERMINATED

Always make sure the list command returns a TERMINATED status when you are done.
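
As a rough sketch of the alias idea, something like the following could go in ~/.bashrc. The alias names gstart, gstop and glist are made up for illustration, and the instance name and zone are the ones from the output above, so adjust them to your own setup.

```shell
# Hypothetical alias names; instance "fastai" and zone "us-west1-b"
# are taken from the example output in this post.
alias gstart='gcloud compute instances start fastai --zone=us-west1-b'
alias gstop='gcloud compute instances stop fastai --zone=us-west1-b'
alias glist='gcloud compute instances list'
```

After reloading the shell, gstop followed by glist gives a quick way to confirm the TERMINATED status.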

go_go_gadget · March 18, 2020, 12:20am

I’m having trouble getting off the ground with GCP on macOS High Sierra, despite having successfully used it on several machines previously. When I attempt to run gcloud init, I get the following error: zsh: command not found: gcloud. Upon Googling, I found this page and I’ve tried adding the following to my .zshrc file:

# The next line updates PATH for the Google Cloud SDK.
source /Users/dwchiang/google-cloud-sdk/path.zsh.inc

# The next line enables zsh completion for gcloud.
source /Users/dwchiang/google-cloud-sdk/completion.zsh.inc

And the following to my path.zsh.inc file:

script_link="$( readlink "$0" )" || script_link="$0"
apparent_sdk_dir="${script_link%/*}"
if [ "$apparent_sdk_dir" == "$script_link" ]; then
  apparent_sdk_dir=.
fi
sdk_dir="$( cd -P "$apparent_sdk_dir" && pwd -P )"
bin_path="$sdk_dir/bin"
export PATH=$bin_path:$PATH

That didn’t work, so I tried another suggestion on the same page, which was to add gcloud to the list of plugins on my ~/.zshrc file, but this also hasn’t worked, and I’m still getting the same error.

I’d greatly appreciate any pointers!

harish3110 · March 18, 2020, 1:30am

I believe you haven’t installed the gcloud CLI for Mac correctly.

Try:
curl https://sdk.cloud.google.com | bash
exec -l $SHELL

And then try initializing it. Hopefully, that works!

go_go_gadget · March 18, 2020, 1:34am

Thanks! That looks like the same code in the tutorial Jeremy linked, which I used originally. I also tried copying and pasting from your post, and still have the same error.
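
One way to narrow down this kind of "command not found" error is to check whether a gcloud executable actually exists in the directory that .zshrc points at. The sketch below assumes the SDK sits in $HOME/google-cloud-sdk (believed to be the installer's default location, but treat that as an assumption); the function name check_gcloud_sdk is hypothetical. Note that the source lines quoted earlier use /Users/dwchiang/..., which only works if that is really where the SDK was installed.

```shell
# Hypothetical helper: report whether a gcloud executable exists under
# the given SDK directory (assumed default: $HOME/google-cloud-sdk).
check_gcloud_sdk() {
    local sdk_dir="${1:-$HOME/google-cloud-sdk}"
    if [ -x "$sdk_dir/bin/gcloud" ]; then
        echo "found: $sdk_dir/bin/gcloud"
    else
        echo "missing: $sdk_dir/bin/gcloud"
        return 1
    fi
}
```

If the check succeeds, adding that bin directory to PATH in ~/.zshrc (for example, export PATH="$HOME/google-cloud-sdk/bin:$PATH") should make gcloud available in new shells. If it reports missing, the SDK is installed somewhere else or not at all.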

harish3110 · March 18, 2020, 4:28am

Hi,

I need some help in trying to SCP some of the notebooks I have created while following along the fastai v2 docs on GCP.

It looks like the setup guide for GCP puts me on the jupyter instance where I have access to the tutorials and the gcloud ssh code puts me in a different place. So when I try to use the provided SCP command, I get an error saying the folder/file I’m trying to transfer doesn’t exist…

Thanks!
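
One possible explanation, offered as an assumption and not a confirmed diagnosis: on these images Jupyter may run as a separate jupyter user whose files live under /home/jupyter, while gcloud ssh and gcloud scp default to your local username and its own home directory. Under that assumption, a sketch of a copy command that addresses the jupyter user explicitly could look like this (fetch_notebook is a hypothetical helper name, and the default zone is the one used earlier in this thread):

```shell
# Hypothetical helper: copy a file or folder from the jupyter user's home
# directory on the instance into the current local directory.
# Assumes notebooks live under /home/jupyter on the instance.
fetch_notebook() {
    local instance="$1" remote_path="$2" zone="${3:-us-west1-b}"
    gcloud compute scp --recurse --zone="$zone" \
        "jupyter@${instance}:/home/jupyter/${remote_path}" .
}
```

For example, fetch_notebook my-fastai-instance course-v4/nbs would try to copy that folder, provided it exists at that location on the instance.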