Thanks! I deleted my instance and started over with a clean instance without --preemptible, which seems to work so far…
Yes! Actually your instructions above (earlier in this thread) are amazing; I followed those and everything is looking great so far. Thanks!
I am unable to open this link to access the notebooks:
http://localhost:8080/tree.
I can see the instance running from the console. And am able to access them through the terminal too. Not sure how to fix this.
When I try to connect using this command -
gcloud compute ssh --zone="us-west1-b" jupyter@"my-fastai-instance" -- -L 8080:localhost:8080
it connects, but with a message that says “could not request local forwarding”.
Can someone please help?
I dug around a little to find what local forwarding is.
- Landed here: https://manas.tungare.name/blog/ssh-port-forwarding-on-mac-os-x
- Used the 'lsof -i -P' command to find existing 'jobs' that had an '8080' description (I really don’t know if any of this terminology is accurate).
- Used 'kill -9' on those jobs to kill them
- Reconnected using the 8080 port.
And it works now.
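The steps above can be sketched as a small shell helper. This is only a sketch: `pids_on_port` is a hypothetical name, it assumes the usual column layout of `lsof -i -P` output (PID in the second column), and it matches the port anywhere on the line, so review the PIDs before killing anything.

```shell
# Read `lsof -i -P` style output on stdin and print the unique PIDs
# of the lines that mention the given port (for example ":8080").
pids_on_port() {
  port="$1"
  # NR > 1 skips the header row; the pattern avoids matching ":80801".
  awk -v port="$port" '
    NR > 1 && $0 ~ (":" port "([^0-9]|$)") { print $2 }
  ' | sort -u
}

# Typical use (requires lsof):
#   lsof -i -P | pids_on_port 8080
#   kill <PID>    # try plain kill first; kill -9 only if that is ignored
# Then reconnect with the tunnel command from the earlier post:
#   gcloud compute ssh --zone="us-west1-b" jupyter@"my-fastai-instance" -- -L 8080:localhost:8080
```

If the process holding the port turns out to be an old ssh session, that would explain the “could not request local forwarding” message: the local port was already taken when the new tunnel tried to bind it.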
Sometimes, I feel like most of what I do is just hoping that something works out.
@jeremy FYI that N2D machines no longer support the west zone nor the p100 GPU. You might want to update your documentation. @rachel FYI too.
After reading some GCP docs I realized that N2D machines are in beta and they are no longer supported in the west zone + they no longer support the p100 GPU.
I got the following setup to work; it has a little more memory than the recommended setup but the same GPU.
export IMAGE_FAMILY="pytorch-latest-gpu"
export ZONE="us-west1-b"
export INSTANCE_NAME="my-fastai-instance"
export INSTANCE_TYPE="n1-highmem-16" # It seems like the N2D machines are in beta and are no longer available in all zones + not working with p100 anymore.
gcloud compute instances create $INSTANCE_NAME \
--zone=$ZONE \
--image-family=$IMAGE_FAMILY \
--image-project=deeplearning-platform-release \
--maintenance-policy=TERMINATE \
--accelerator="type=nvidia-tesla-p100,count=1" \
--machine-type=$INSTANCE_TYPE \
--boot-disk-size=200GB \
--metadata="install-nvidia-driver=True"
# --preemptible is left out on purpose: preemptible gave me issues before, as described in this thread too.
Hey, I recently moved to GCP from Paperspace.
I’m getting this error even though I opted for the passphrase, and I have no clue what the problem is. I’m getting the key from Compute Engine --> Metadata --> SSH.
I have also tried getting it from /Users/<sys_username>/.ssh/google_compute_engine.pub.
Need help!
@steef I believe the OP wasn’t a wiki, I’ve fixed that. Could you please add these to the top post maybe under a new section?
You could also open a PR for the GCP setup instructions if you’d like.
Thank you!
Ooooh, I see that @steef’s post already answered my question about which machine to use (-: so I deleted the original post. Just out of curiosity, since I’m located in Europe, would it make more sense to use us-west1-b (the recommended zone for GCP) or a European one, e.g., europe-west4-b?
PS I can add this info to the OP wiki post, if @steef would rather not do that.
@init_27 I don’t want to steal @steef’s thunder, but just for my own understanding: this is the doc one should update with a PR, right?
and these are the PR instructions for this repo:
Note that I’m pointing to v3, because the v4 repository doesn’t include a docs folder atm.
I believe this is the one
I edited the OP with the instructions by @gautam_e and @steef . Have a look and let me know what you think about it!
Thanks @AndreaPi & @init_27! Not stealing my thunder at all – just glad my findings were useful for others too
Hello all - wondering if anyone more proficient with GCP can help:
I’m getting an error saying that us-west1-b is out of resources and I can’t start my fastai instance. The error message suggests I try a different zone, but none of the other zones I try that offer an Nvidia Tesla K80 work - I just get another error because I created my instance in the us-west1-b zone.
Is anyone aware of a quick fix? Or do I just have to wait for resources to become available?
Starting instance(s) alex-fastai-instance...failed.
ERROR: (gcloud.compute.instances.start) The zone 'projects/fastai-alex-2020/zones/us-west1-b'
does not have enough resources available to fulfill the request. Try a different zone, or try again later.
When I try a different zone I get this:
$ gcloud compute instances start alex-fastai-instance --zone=us-central1-c
ERROR: (gcloud.compute.instances.start) HTTPError 404:
The resource 'projects/fastai-alex-2020/zones/us-central1-c/instances/alex-fastai-instance' was not found
Many thanks,
Alex
Did you create a preemptible instance? With non-preemptible instances I had this problem once, but after a second try to start the instance it worked.
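Retrying the start command, as suggested above, can be automated with a small loop. A minimal sketch: `retry` is a hypothetical helper, and the try count and delay in the usage comment are arbitrary choices. The instance name and zone are the ones from the error message earlier in the thread.

```shell
# retry MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_TRIES attempts have failed.
retry() {
  max_tries="$1"
  delay="$2"
  shift 2
  attempt=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_tries" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example: try once a minute, up to 10 times.
#   retry 10 60 gcloud compute instances start alex-fastai-instance --zone=us-west1-b
```

The zone has to stay the one the instance was created in. An instance belongs to a single zone, which is why starting it with --zone=us-central1-c returns the 404 shown above.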
I created a non-preemptible instance (and preemptibility is still switched off) so hadn’t expected this. In my case I tried several times over about ten minutes and then spent some time going through Python for Data Analysis (Wes McKinney), which was definitely worth doing! It worked when I returned a couple of hours later.
Second “problem” I have with my GC account is that I’m actually being charged to use it. I had expected everything to be free on my trial, so I’ll have to investigate that. Could be that I’ve used a US-located VM instance when I’m in Europe. I can’t remember the billing details well enough, so it’s on the to-do list to solve at a later date…
I have been trying to access my VM instance for the past couple of days. I’m stuck on the same error after (1) waiting patiently and (2) creating a new instance with a snapshot of my previous instance but with a different region:
The zone 'projects/fast-ai-jay/zones/us-west1-b' does not have enough resources available to fulfill the request. Try a different zone, or try again later.
I tried making a new instance again, but I got this error:
Create VM instance "my-fastai-instance-3" and its boot disk "my-fastai-instance-3" 1 hour ago: fast-ai-jay The zone 'projects/fast-ai-jay/zones/australia-southeast1-c' does not have enough resources available to fulfill the request. Try a different zone, or try again later.
Try creating a preemptible instance, this is what worked for me on us-west1-b. I had been on a non-preemptible instance for the duration of this course and during mid week of lesson 8 I started getting the same type of errors you have seen. After many tries, I went back to us-west1-b and decided to add on preemptible which worked! Since last Wednesday, I haven’t had any issues so far in regards to the resources error or being pre-empted.
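For comparison, here is a sketch of the create command from earlier in this thread with preemptibility as a switch. `print_create_cmd` is a hypothetical helper that only prints the command so it can be reviewed before running. The defaults are the values posted above, and `--preemptible` is the flag that was commented out there.

```shell
# Print (do not run) the create command from earlier in this thread.
# Pass "yes" as the first argument to append --preemptible.
print_create_cmd() {
  preemptible="${1:-no}"
  # Build the argument list; the environment variables are the ones
  # exported in the earlier post, with the same values as fallbacks.
  set -- gcloud compute instances create "${INSTANCE_NAME:-my-fastai-instance}" \
    --zone="${ZONE:-us-west1-b}" \
    --image-family="${IMAGE_FAMILY:-pytorch-latest-gpu}" \
    --image-project=deeplearning-platform-release \
    --maintenance-policy=TERMINATE \
    --accelerator=type=nvidia-tesla-p100,count=1 \
    --machine-type="${INSTANCE_TYPE:-n1-highmem-16}" \
    --boot-disk-size=200GB \
    --metadata=install-nvidia-driver=True
  if [ "$preemptible" = "yes" ]; then
    set -- "$@" --preemptible
  fi
  echo "$@"
}

# print_create_cmd        # non-preemptible variant
# print_create_cmd yes    # preemptible variant
```

Keeping both variants one argument apart makes it easy to switch when one kind of instance runs into the resources error and the other does not.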
Great, that worked for me!
Thanks @JorgeBriones
Perhaps we should change the top post to reflect the commands for a preemptible instance, since that appears to be the solution to the problem a lot of people are having? What do the others think? Like this message if you are in favour of this suggestion.
Ok! At some point in time multiple users including me had the opposite issue, i.e., preemptible would never work, while non-preemptible ones would work right from the start. But GCP is a stochastic environment so it’s entirely possible that now preemptible instances work better. If you want to edit the Wiki, I’m fine with that!