Google Cloud Platform

Hey Nick - did you find a solution for this? I’m running into the same problem now…

@steef yes, a bit of a workaround, really, just used the parameters from @futbol10 above. Worked for me for training the model in the first lesson.

export IMAGE_FAMILY="pytorch-latest-gpu"
export ZONE="us-west1-b"
export INSTANCE_NAME="fastai-instance"
export INSTANCE_TYPE="n1-highmem-4" 

gcloud compute instances create $INSTANCE_NAME \
        --zone=$ZONE \
        --image-family=$IMAGE_FAMILY \
        --image-project=deeplearning-platform-release \
        --maintenance-policy=TERMINATE \
        --accelerator="type=nvidia-tesla-k80,count=1" \
        --machine-type=$INSTANCE_TYPE \
        --boot-disk-size=200GB \
        --metadata="install-nvidia-driver=True" \
        --preemptible

export ZONE="us-west1-b"
export INSTANCE_NAME="fastai-instance"
gcloud compute ssh --zone=$ZONE jupyter@$INSTANCE_NAME -- -L 8080:localhost:8080

@salmaniai I had this same error. See my post above for parameters from @futbol10 that seem to work.

@salmaniai @steef if you can’t set it up from the CLI, you can also set up VM instances through the Google Cloud console. See below for more info.


https://cloud.google.com/ai-platform/notebooks/docs/create-new

Same here – I also found a workaround.

After reading some GCP docs I realized that N2D machines are in beta and they are no longer supported in the west zone + they no longer support the p100 GPU.

I got the following setup to work which has a little more memory than the recommended setup but has the same GPU as recommended.

@jeremy FYI that N2D machines no longer support the west zone nor the p100 GPU. You might want to update your documentation. @rachel FYI too.

export IMAGE_FAMILY="pytorch-latest-gpu" 
export ZONE="us-west1-b"
export INSTANCE_NAME="my-fastai-instance"
export INSTANCE_TYPE="n1-highmem-16" # It seems like the N2D machines are in beta and are no longer available in all zones + not working with p100 anymore.

gcloud compute instances create $INSTANCE_NAME \
        --zone=$ZONE \
        --image-family=$IMAGE_FAMILY \
        --image-project=deeplearning-platform-release \
        --maintenance-policy=TERMINATE \
        --accelerator="type=nvidia-tesla-p100,count=1" \
        --machine-type=$INSTANCE_TYPE \
        --boot-disk-size=200GB \
        --metadata="install-nvidia-driver=True"
# --preemptible left out on purpose: don't use preemptible as it gave me issues before

@salmaniai I ran into the same existence issue that you described and that issue is also resolved with this solution.


Thank you @vijaysai for the reply!

Hi everyone!
I was using Paperspace earlier, had some issues with space management there, and decided to switch to GCP.


Got this error today, need help!
Thanks

I think you made a mistake copying INSTANCE_TYPE. Try
export INSTANCE_TYPE="n2d-highmem-8"
instead. It should work fine. Also, I suggest not using us-west1-b as your zone. It is quite busy, and instances there frequently get preempted. I suggest europe-west1-b or another zone.
All the best

Hello 🙂

I need some help with setup. I tried to increase my GPU quota to 1. I followed all the steps in the server setup guide for Google Cloud and made a request to increase the quota. I got a confirmation email saying that the request was successfully received. But within a few seconds, I got another email saying:

Unfortunately, we are unable to grant you additional quota at this time. If
this is a new project please wait 48h until you resubmit the request or
until your Billing account has additional history.

Your Sales Rep is a good Escalation Path for these requests, and we highly
recommend you to reach out to them.

My project is new and I have waited for weeks, yet the quota hasn’t changed. I have tried many times, and every single time I get the same email within a few seconds of the confirmation email. Can someone help me resolve the issue?

Note: I have upgraded my account, my project is linked to a billing account.

Any help is appreciated 🙂
Thanks!

Hi Palaash, are you sure n2d-highmem-8 is compatible with a GPU? I believe things have changed in GCP.

Have a look at the Caution note in the screenshot from the GCP documentation.

I am hitting the exact same problem. Any other help is appreciated.

Dear all, I think n2d-highmem-8 machines are available in some zones and P100 GPUs in others. I think n2d-highmem-8 machines with a P100 GPU were available in US-West-??somewhere before. So I would like to know what other budget machine/GPU combinations are available. Can anyone share their success story and info about the cost? BTW, I am using Colab for fastai2 as a free option.
Thanks to @duerrseb for his post: FastAI2 notebooks in Kaggle, which were updated March 22.
https://github.com/seduerr91/fastAI_v4/blob/master/fastai2%20on%20colab.md
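On the question of which machine types and GPUs exist in which zones, something like the following should answer it from the CLI. This is only a sketch: the helper names `gpu_zones` and `machine_types_in_zone` are made up for this post, and it assumes gcloud is installed and pointed at your project.

```shell
# Sketch only: the helper names are hypothetical, not part of gcloud.

# List every zone that offers a given GPU type.
gpu_zones() {
    gcloud compute accelerator-types list --filter="name=$1"
}

# List the machine types in one zone whose names start with a prefix.
machine_types_in_zone() {
    gcloud compute machine-types list --filter="zone:$1 AND name~^$2"
}

# Example:
#   gpu_zones nvidia-tesla-p100
#   machine_types_in_zone us-west1-b n1-highmem
```

Note that this only shows what is offered per zone. It does not tell you whether a machine family accepts GPUs at all, so the Caution note in the GCP docs still applies.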

Hi @Vineeth
I’m not sure of the exact issue, but it is possible that you submitted more than one request for quota allocation. In that case, you can go to your GCP console and check how much quota you have. You should most probably have a quota of at least 1.
If you don’t have any quota, try resubmitting a request, and you should be good to go.
Cheers, Stay Safe
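If the console is awkward, the current quota can probably be read from the CLI as well. A minimal sketch, assuming gcloud is configured for the project in question; the function names are hypothetical, and I am assuming the GPU quota metrics contain the string GPUS (for example GPUS_ALL_REGIONS at project level).

```shell
# Sketch only: the function names are hypothetical, not part of gcloud.

# Show project-wide GPU quota rows (metric, limit, usage).
gpu_quota_global() {
    gcloud compute project-info describe \
        --flatten="quotas[]" \
        --format="table(quotas.metric,quotas.limit,quotas.usage)" | grep GPUS
}

# Show GPU quota rows for one region, e.g. us-west1.
gpu_quota_region() {
    gcloud compute regions describe "$1" \
        --flatten="quotas[]" \
        --format="table(quotas.metric,quotas.limit,quotas.usage)" | grep GPUS
}

# Example:
#   gpu_quota_global
#   gpu_quota_region us-west1
```

A limit of 0 on the GPU rows would mean the increase request never went through.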

Hi @satyartha
Try removing the Machine type option, and see if it works. You can refer to this link

https://cloud.google.com/ai-platform/deep-learning-vm/docs/pytorch_start_instance#with_one_or_more_gpus_2

You can follow the rest as is mentioned on fastai’s installation guide.

But I think you won’t be able to do any computation on CPUs because of that. That’s okay, since we do most of our computation on GPUs anyway. For example, in the fastai course v3 (part 1), there’s only one place in one of the notebooks where inference was done on CPUs, as far as I remember. You can easily carry out that step on GPUs instead. It shouldn’t create a problem.

Cheers, Stay Safe


Thank you @PalaashAgrawal for the suggestion. There were some compatibility issues between n2d-highmem-8 and the mentioned GPU.

I tried INSTANCE_TYPE="n1-highmem-16" instead and it worked for me.

Also please note that removing the --preemptible parameter will increase your costs to about $1.29 an hour.


Yeah, I use a budget n1-highmem-4 with nvidia-tesla-t4 and the combination works. A PR has been sent to update (revert) the documentation, but it hasn’t been merged yet.

FWIW, I also tried the Gradient free tier before trying GCP, and found that it is a bit faster than my GCP instance setup above (tested with the lesson-1 pets example with bs=64: 22 sec vs. 48 sec per epoch).


Hello good people.

I just started the v3 course with a Google Cloud instance. It has been a mixed bag, but lately I am getting cut off from my session very often. So I wanted to know if anyone can tell me the easiest way to move my instance to a non-preemptible one.
I know that I cannot just edit my existing instance and set it to non-preemptible (which really annoys me), and I would like to keep my storage so I don’t lose all my progress (and don’t have to pay for twice as much storage).

If anyone can help me, that would be great. I tried searching the forums and the internet, but I could not really find anything, and using the Google Cloud web interface is a frustrating experience.

Never mind. I got so frustrated after trying to create an image from the instance, which was still set to preemptible with no way for me to change it, that I deleted the whole instance and set up a new one. This probably means I have to pay for that image plus the new storage I added, but at least I have something non-preemptible.
Man, I remember AWS being much more transparent, e.g. about what costs money. Google Cloud is really opaque.
Anyway, I advise everybody against preemptible instances at this point, as they don’t stay up for longer than 10 minutes (if they start at all).
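For anyone else in this situation, one approach that should keep your storage is to delete the VM while keeping its boot disk, then create a standard VM on top of that disk. A rough sketch, not something I have verified on every setup: the function name is made up, and it assumes the boot disk has the same name as the instance (the default when gcloud creates the disk for you). Take a snapshot first if the data matters.

```shell
# Sketch only: hypothetical helper, assumes boot disk name == instance name.
# Arguments: zone, instance name, machine type, GPU type.
recreate_non_preemptible() {
    # Delete the VM but keep its boot disk (gcloud asks for confirmation).
    gcloud compute instances delete "$2" --zone="$1" --keep-disks=boot &&
    # Create a new VM on the kept disk; no --preemptible flag this time.
    gcloud compute instances create "$2" \
        --zone="$1" \
        --machine-type="$3" \
        --accelerator="type=$4,count=1" \
        --maintenance-policy=TERMINATE \
        --disk="name=$2,boot=yes"
}

# Example:
#   recreate_non_preemptible us-west1-b fastai-instance n1-highmem-4 nvidia-tesla-k80
```

Since the same disk is reused, the image and driver flags from the original create command are not needed again, and you only pay for one disk.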

I’ve been auto-rejected every time for the last couple months. Calling their support wasn’t helpful, they would forward me to sales, who forwarded me back to support who forwarded me back to sales. Verified, months-old account. Even tried adding $50 to my account as recommended in some posts I found on reddit.

I still haven’t been able to use GCP…