Platform: GCP ✅


(Paul M) #396

Could anyone get the lesson3 NLP notebook to run with the us-west2-c instance? I lowered the batch size to 30 but still got CUDA out-of-memory errors more than half an hour into the training.


(Fernando Melo) #397

SATURDAY NIGHT, 9:30 PM Miami local time!
I now have 3 instances created in different zones: us-west2-c, us-central1-c, us-east1-c.
I’m trying to start my instances and guess what???
Google Cloud “does not have enough resources available to fulfill the request”.
No resources in west, central and east!!
That’s fantastic!
I already upgraded my account, my instances are NOT preemptible …


(Arunoda Susiripala) #398

Interesting fact:

  • My personal account (which is upgraded but has $300 credits) has this resource availability issue
  • But one of my friends’ accounts doesn’t have this issue (his is a production account which is pretty old)

(Ad Postma) #399

Same problem here for more than 24 hours now with my instance in us-west2. I was able to create a second instance in zone europe-west4. This worked for a few hours but now gives the same error when trying to connect.


(Ad Postma) #400

I can now start my instance in us-west2!
Does anyone know how to work from 2 instances (in different regions) on the same notebooks and datasets?


(Stephen Mak) #401

I’m struggling to open my Jupyter Notebook using GCP. I’ve followed the GCP Tutorial here on the forums, and everything seems to have been set up ok. In Step 3, because the US servers didn’t have any P4s available, I had to switch to europe-west4-c, and I’ve changed all the arguments accordingly.

I then go to the “Returning to GCP” thread and follow the steps. I get to

gcloud compute ssh --zone=ZONE jupyter@INSTANCE_NAME -- -L 8080:localhost:8080

replacing zone with “europe-west4-c” and instance name with “my-fastai-instance”. I have then created my SSH key and typed in my pass phrase, and things seem to be going ok, being greeted with a “Welcome to Google Deep Learning VM” etc.

I now open a new tab in Chrome and type in “http://localhost:8080/tree” where I am then greeted by the sad error below (where I expect Jupyter Notebook to open up):

image

Anyone know what is causing the error? As I am on Windows 8, I’ve executed all the commands in GCP’s Power Shell (or whatever it’s called) and not used Cygwin, but I don’t think that’s the problem (I run into other errors when using Cygwin). Any suggestions would be very welcome! I’ve been stuck on this part for a couple of hours now! :frowning:


#402

I encountered the same issue, so maybe this will work for you, too: type jupyter notebook at the command line. When I use GCP, the notebook does not seem to autostart. I also just used GCP’s Cloud Shell.


#403

For those struggling to find available resources, this workaround from Stack Overflow seems to work for me:

To “fix” this, just create the VM without an external IP, then after the VM is successfully created, edit it and add an external IP.
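
With the gcloud CLI, that workaround might look roughly like the sketch below. The function name create_then_add_ip and the EXTRA_CREATE_FLAGS variable are made up for illustration; the machine type, GPU and image flags from the tutorial's create command are not repeated here and would need to be supplied through that variable. Other replies in this thread report that the trick did not help them, so treat it as something to try rather than a guaranteed fix.

```shell
# Sketch only: create_then_add_ip and EXTRA_CREATE_FLAGS are illustrative names.
# EXTRA_CREATE_FLAGS should hold the remaining flags from the tutorial's create command.
create_then_add_ip() {
  instance_name="$1"
  zone="$2"

  # Step 1: create the VM with no external IP address.
  # EXTRA_CREATE_FLAGS is left unquoted on purpose so it splits into separate flags.
  gcloud compute instances create "$instance_name" \
    --zone="$zone" \
    --no-address \
    ${EXTRA_CREATE_FLAGS:-} || return 1

  # Step 2: only after creation has succeeded, attach an external IP
  # (an ephemeral one, since no specific address is requested).
  gcloud compute instances add-access-config "$instance_name" --zone="$zone"
}
```

It would be called as, for example, create_then_add_ip my-fastai-instance us-west2-b.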


(Iyappan Subramanian) #404

I tried creating an instance with “--no-address”, still getting the same error, ‘doesn’t have enough resources…’

I tried different zones in us-west1, us-west2, us-central1 etc.

Stuck for now!!!


(Stephen Mak) #405

Yeah I tried that too but unfortunately I get this:

jupyter@my-fastai-instance:~$ jupyter notebook
[I 22:25:10.133 NotebookApp] [nb_conda_kernels] enabled, 0 kernels found
[I 22:25:10.138 NotebookApp] Writing notebook server cookie secret to /run/user/1000/jupyter/notebook_cookie_secret
[W 22:25:10.281 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[W 22:25:10.281 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using authentication. This is highly insecure and not recommended.
[I 22:25:10.281 NotebookApp] The port 8080 is already in use, trying another port.
jupyter_http_over_ws extension initialized. Listening on /http_over_websocket
[I 22:25:10.308 NotebookApp] JupyterLab extension loaded from /opt/anaconda3/lib/python3.7/site-packages/jupyterlab
[I 22:25:10.308 NotebookApp] JupyterLab application directory is /opt/anaconda3/share/jupyter/lab
[I 22:25:10.437 NotebookApp] [nb_conda] enabled
[I 22:25:10.470 NotebookApp] ✓ nbpresent HTML export ENABLED
[W 22:25:10.471 NotebookApp] ✗ nbpresent PDF export DISABLED: No module named 'nbbrowserpdf'
[I 22:25:10.471 NotebookApp] Serving notebooks from local directory: /home/jupyter
[I 22:25:10.471 NotebookApp] The Jupyter Notebook is running at:
[I 22:25:10.471 NotebookApp] http://(my-fastai-instance or XXX.0.0.1):8081/
[I 22:25:10.471 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

I’ve replaced 3 digits of the IP address in this comment, as I’m not sure whether I should post it publicly (better safe than sorry!).

I then tried opening:

  1. http://my-fastai-instance:8081/
  2. http://XXX.0.0.1:8081
  3. http://localhost:8081/tree

in separate tabs but to no avail I’m afraid. Any suggestions? :frowning:


#406

Try :8080 instead of :8081. Mine showed :8081, but I could only connect on :8080. I don’t know why it would work, but it did.
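
One possible explanation, judging from the log above: the line "The port 8080 is already in use" suggests a notebook server was already running on 8080 before jupyter notebook was typed, and the ssh command from the “Returning to GCP” thread only forwards local port 8080. A server on 8081 would not be reachable through that tunnel unless 8081 is forwarded as well. A rough sketch, where tunnel_to_port is a made-up helper name:

```shell
# On the VM: list the notebook servers that are currently running, with their ports.
#   jupyter notebook list

# On your local machine: open an SSH tunnel to whichever port the server reports.
# tunnel_to_port is an illustrative helper, not part of gcloud.
tunnel_to_port() {
  zone="$1"
  instance_name="$2"
  port="${3:-8080}"   # default to 8080, the port used in the tutorial command

  # Everything after "--" is passed straight to ssh; -L forwards the local port.
  gcloud compute ssh --zone="$zone" "jupyter@$instance_name" -- \
    -L "$port:localhost:$port"
}
```

For example, tunnel_to_port europe-west4-c my-fastai-instance 8081 would make http://localhost:8081/tree point at the second server.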


#407

I didn’t try it, but somebody already reported back that it doesn’t work.
“Not working for me. Further, I have an existing instance but can’t start it anymore. I get the same error as in the creation.”
:rage:


(Arunoda Susiripala) #408

If you still have issues with GCP, try this: https://github.com/arunoda/fastai-shell
This is not the recommended method. But it’ll work for sure.


#409

I just managed to start up a new VM in us-west2-b,
but still no luck in australia-southeast1-b


(Bilal) #410

Is there an image (image_family) on GCP for fastai 0.7?


(Pradeep Vasamsetti) #411

Check the firewall settings in GCP, and if you don’t find a rule for port 8080, try creating a new rule which allows tcp:8080
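
From the command line, that could look something like the sketch below. The rule name allow-jupyter-8080 and both function names are made up. As written, the rule goes on the default network and, with no source range given, allows traffic from any address, so it is worth restricting before relying on it.

```shell
# Sketch only: the function names and allow-jupyter-8080 are illustrative.

# First look at which firewall rules already exist in the project.
list_firewall_rules() {
  gcloud compute firewall-rules list
}

# If nothing covers port 8080, add a rule that allows incoming TCP traffic on it.
allow_jupyter_port() {
  rule_name="${1:-allow-jupyter-8080}"
  gcloud compute firewall-rules create "$rule_name" --allow=tcp:8080
}
```

If you reach Jupyter through the ssh -L tunnel, the traffic travels over SSH, so this rule should only matter when connecting to the VM's external IP directly.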


(Maria) #412

Wouldn’t using git be easier? Like do your code locally and then git pull on the machine.
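
A rough sketch of that workflow, assuming the notebooks already live in a git repository with a remote that every machine can reach (the helper name sync_notebooks and the default commit message are made up). Large datasets are usually not a good fit for git, so this covers notebooks and code only.

```shell
# Sketch only: sync_notebooks is an illustrative helper.
# Run it inside the repository on whichever machine you just finished working on.
sync_notebooks() {
  message="${1:-Update notebooks}"

  # Stage everything, and commit only when something actually changed.
  git add -A
  if ! git diff --cached --quiet; then
    git commit -m "$message" || return 1
  fi

  # Pick up what the other machine pushed, replaying local commits on top.
  git pull --rebase || return 1

  # Publish so the other machine can pull.
  git push
}
```

On the other machine, a plain git pull inside the same repository then brings the changes over.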

Here is a feature request to automatically create a VM in another zone when resources in the current one are not available. Star it, so maybe Google will consider implementing it: https://issuetracker.google.com/issues/77734062
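
Until something like that exists, the fallback can be scripted by hand. A minimal sketch, assuming you are creating a fresh instance (an existing instance stays tied to the zone it was created in). The function name create_in_first_available_zone and the EXTRA_CREATE_FLAGS variable are made up, and the tutorial's machine type, GPU and image flags would go in that variable.

```shell
# Sketch only: create_in_first_available_zone and EXTRA_CREATE_FLAGS are illustrative names.
create_in_first_available_zone() {
  instance_name="$1"
  shift   # the remaining arguments are the zones to try, in order

  for zone in "$@"; do
    echo "Trying $zone ..." >&2
    # gcloud's own output is sent to stderr so that stdout carries only the zone.
    # EXTRA_CREATE_FLAGS is left unquoted on purpose so it splits into separate flags.
    if gcloud compute instances create "$instance_name" \
         --zone="$zone" ${EXTRA_CREATE_FLAGS:-} >&2; then
      echo "$zone"   # print the zone that worked
      return 0
    fi
  done

  echo "Could not create the instance in any of the zones tried." >&2
  return 1
}
```

For example, create_in_first_available_zone my-fastai-instance us-west2-c us-central1-c us-east1-c. Note that the loop moves on after any kind of failure, not just a capacity error, and not every GPU type is offered in every zone.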


(John Lambert) #413

Based on my understanding of what I’ve read here and elsewhere, it’s not possible to make a preemptible instance non-preemptible - apparently you have to create a new instance (leaving out the --preemptible option in step 3, as @paul pointed out above).
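
To check how an existing instance was created before deciding whether to recreate it, something like the following may help. It is only a sketch: is_preemptible is a made-up helper, and it assumes the instance's scheduling.preemptible field comes back as True or False when requested through --format.

```shell
# Sketch only: is_preemptible is an illustrative helper.
# Returns 0 if the instance is preemptible, 1 if not, 2 if the lookup itself failed.
is_preemptible() {
  instance_name="$1"
  zone="$2"

  value=$(gcloud compute instances describe "$instance_name" \
            --zone="$zone" \
            --format='value(scheduling.preemptible)') || return 2

  case "$value" in
    [Tt]rue) return 0 ;;
    *)       return 1 ;;
  esac
}
```

It can be used in a condition, for example: if is_preemptible my-fastai-instance us-west2-b; then echo "preemptible"; fi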


(Paul M) #414

If someone knowledgeable about GCP is reading this thread, could you please summarize how to set up and use a resilient system for working through the fastai course? I will switch back to using AWS for the rest of this course, but I didn’t consider that GCP could lock me out of my code and data. (I know it’s very silly of me not to keep a local backup of everything, but I have never before been unable to log in to a service for 3 days in a row despite multiple daily attempts.)

A Google poster in the Stack Overflow thread that was linked earlier in this channel recommends going to 3 websites (see below), but doesn’t explain how to turn them into a simple process like the one described in the GCP tutorial. So I wonder whether there truly is a simple working solution available, and what it costs. I was very happy with my early experiences with GCP, when logging in to an instance simply worked every time, but I don’t want to spend a lot of time going through the documentation tree if this is not going to be a robust longer-term solution. Here are the links, in case anyone else can translate them into a simple update to the current fastai tutorial:

https://cloud.google.com/solutions/scalable-and-resilient-apps

https://cloud.google.com/compute/docs/regions-zones

https://cloud.google.com/compute/docs/tutorials/robustsystems
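
On the local-backup point: while an instance is still reachable, its files can be copied down with gcloud compute scp. A rough sketch, where backup_from_instance, the remote path and the local directory are all placeholders. It does nothing for an instance that already refuses to start, so it only helps if run regularly beforehand.

```shell
# Sketch only: backup_from_instance and the default directory are illustrative.
backup_from_instance() {
  zone="$1"
  instance_name="$2"
  remote_path="$3"                 # directory on the VM to copy
  local_dir="${4:-./gcp-backup}"   # where to put the copy on this machine

  mkdir -p "$local_dir" || return 1

  # --recurse copies the whole directory tree.
  gcloud compute scp --recurse --zone="$zone" \
    "jupyter@$instance_name:$remote_path" "$local_dir"
}
```

For example, backup_from_instance europe-west4-c my-fastai-instance /home/jupyter would copy the directory the notebook server reports serving from.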


(Akshay Bhardwaj) #415

Did my setup on GCP before but didn’t use it again till now (was working before). Got an error now :frowning: Something regarding the instances not being available, went through the forum and found that I should probably use a different zone while creating the machine (everyone was using default in the docs, like me). Did so and machine working now! Yes!
P.S.: My first forum post. Baby steps.