Platform: GCP ✅


(Arunoda Susiripala) #459

I had that issue too with the official image, so I switched back to installing fastai from scratch on an Ubuntu instance.


(Sander de Ruiter) #460

Thank you. Can you link to an install guide, or something similar?
My best guess now is to delete my instance and just start over (using the fast.ai GCP tutorial).


(Arunoda Susiripala) #461

I use this: https://github.com/arunoda/fastai-shell


(Sander de Ruiter) #462

Thanks, that works wonders. Question on this: where do you store your own work? On my other instance I stored it in the same dl1 folder, but that messes up the git pull command (I need to stash and stash pop to get git pull working).


#463

I have not been able to start my GPU machine on GCP for the past couple of days.
It says “quota exceeded globally”. I have started the machine without a GPU to do some data pre-processing, but did not do any model training.

Is anyone in the same situation as me?
Should I move to AWS?


(Arunoda Susiripala) #464

You need to increase your GPU quota and upgrade your account.
Search this thread for more info.


(Arunoda Susiripala) #465

You can save anywhere inside your instance.
But I recommend doing it in a different directory and pushing your changes to GitHub.
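
One way to follow this advice, sketched under assumptions: the directory name `~/my-work`, the notebook path in the comments, and the remote URL are placeholders, not anything fastai-shell sets up. The idea is that your copies of the notebooks live in their own repository, so `git pull` in the course folder never sees local changes.

```shell
# Hypothetical layout: personal work lives outside the course checkout.
WORK_DIR="${WORK_DIR:-$HOME/my-work}"

mkdir -p "$WORK_DIR"
cd "$WORK_DIR"

# Make it a repository of its own (safe to re-run on an existing one).
git init

# Copy a course notebook here before editing it, for example:
#   cp ~/course-v3/nbs/dl1/lesson1-pets.ipynb .
# Then commit and push to your own GitHub repository:
#   git add . && git commit -m "Lesson 1 experiments"
#   git remote add origin git@github.com:<your-user>/<your-repo>.git
#   git push -u origin master
```

With this layout the stash / stash pop dance should no longer be needed, because the course folder stays untouched.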


(Sander de Ruiter) #466

Question: sgugger has committed a fix to the master branch, that is currently not in the latest (tagged) release. As of writing, 1.0.27 is the latest tag, which I have, using the update_fastai.sh script.

Do you know of a quick way for me to download the master branch, while still maintaining the possibility to use update_fastai.sh if needed?


(Arunoda Susiripala) #467

Simply do this: https://github.com/fastai/fastai#developer-install

Do this after update-fastai.sh.
(Which updates pytorch)


(Sander de Ruiter) #468

Perfect! After doing this, version shows 1.0.28.dev.
What would be the procedure to reverse this again?


(Arunoda Susiripala) #469

Just install fastai via conda.
Or just run the update-fastai.sh script.

You can also check out a release tag in the repo instead of checking out master.
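
The options above might look like this in practice. This is only a sketch: it assumes the developer install was done from a clone at `~/fastai` (a hypothetical path), that release tags are named after the version number (list them first to confirm), and that fastai is installed from the `fastai` conda channel. The function names are illustrative.

```shell
# Assumption: the developer-install clone lives here; adjust as needed.
FASTAI_REPO="${FASTAI_REPO:-$HOME/fastai}"

# Pin the editable install to a tagged release instead of master.
use_fastai_tag() {
    git -C "$FASTAI_REPO" fetch --tags &&
    git -C "$FASTAI_REPO" checkout "$1"
}

# Go back to the packaged release: drop the editable install,
# then reinstall via conda.
use_fastai_release() {
    pip uninstall -y fastai &&
    conda install -y -c pytorch -c fastai fastai
}

# Usage:
#   git -C "$FASTAI_REPO" tag --list   # see which tags exist
#   use_fastai_tag 1.0.27
#   use_fastai_release
```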


(Deepanshu Thakur) #470

While trying to create a V100 instance I am receiving this error: Quota 'GPUS_ALL_REGIONS' exceeded. Limit: 0.0 globally.


(An) #471

Check your GPU quota settings (IAM & admin -> Quotas).
If you have to change a GPU quota, just file a ticket (select the GPU quota -> Edit).


(Haider Alwasiti) #472

If you get this error when you SSH to a GCP instance:

[Connection Refused]

The solution is to execute this line in bash on your PC:
gcloud compute routes create default-internet --destination-range 0.0.0.0/0 --next-hop-gateway default-internet-gateway

This is because the default route for non-local traffic (0.0.0.0/0) had been inadvertently deleted, which caused all external traffic to be lost on the return path.

Source
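
Before recreating the route, it may be worth confirming that it is actually missing. A sketch, assuming `gcloud` is already authenticated for the project; the two function names are only illustrative, and the create command is the one quoted above.

```shell
# List routes whose destination is the catch-all range.
list_default_routes() {
    gcloud compute routes list --filter='destRange="0.0.0.0/0"'
}

# Recreate the route only when the listing comes back empty.
# Note: an empty listing can also mean gcloud itself failed
# (e.g. not logged in), so check that case by hand first.
ensure_default_route() {
    if [ -z "$(list_default_routes 2>/dev/null)" ]; then
        gcloud compute routes create default-internet \
            --destination-range 0.0.0.0/0 \
            --next-hop-gateway default-internet-gateway
    fi
}

# Usage:
#   list_default_routes
#   ensure_default_route
```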


(Deepanshu Thakur) #473

@tillia I cannot see any quota related to GPU in my IAM & Admin -> Quotas


(An) #474

Start by filtering the Metrics dropdown by ‘GPU’; you should then see all GPU-related quotas.


(Deepanshu Thakur) #475

@tillia In filters I see all the GPU and they are enabled (blue tick in front of them)


(An) #476

You have to filter down to only the GPU quotas (choose None in the dropdown, then filter by GPU and check the quotas you wanna select). All GPU quotas should then be listed, with the actual quota for each row on the right side of the table. Tick the checkbox for the quotas you want to change, then select Edit and write a ticket 🙂
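
The same limit can also be read from the command line, which may be quicker than the console filter. A sketch, assuming `gcloud` is authenticated and that `gcloud compute project-info describe` prints each quota as a `limit` / `metric` / `usage` triple in its default YAML output; `gpu_quota` is just an illustrative helper name.

```shell
# Keep the line before (limit) and after (usage) the global GPU metric.
gpu_quota() {
    grep -B1 -A1 'metric: GPUS_ALL_REGIONS'
}

# Usage:
#   gcloud compute project-info describe | gpu_quota
# A limit of 0.0 matches the "Limit: 0.0 globally" error. The increase
# itself still has to be requested through the Quotas page.
```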


(Marc Rostock) #478

Just in case someone encounters this problem on GCP: training is very slow because PyTorch only uses one process, even though you specified num_workers = x (>1) (normally fastai does this for you by default, with x = number of CPUs). This seems to be a bug in older PyTorch versions (including the one that came preinstalled with the official image I used, following the fastai docs).

Upgrade PyTorch with conda install pytorch-nightly -c pytorch, not conda update (which will tell you it is already on the latest version). That solves the problem: you get multiple workers and faster training. (Works at least with version build pytorch.dev2018-11-30.)


(EPHRAIM AYEMERE) #479

I tried updating the course but got this error. help pls?

jupyter@instance-1:~$ cd tutorials/fastai/course-v3

-bash: cd: tutorials/fastai/course-v3: No such file or directory
jupyter@instance-1:~$
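
The error only says that this relative path does not exist under the current directory. A quick way to see where (or whether) the course repository was cloned on the instance, sketched with a hypothetical helper name and an arbitrary search depth:

```shell
# Search below a starting directory (default: home) for a course-v3 checkout.
find_course_dir() {
    find "${1:-$HOME}" -maxdepth 4 -type d -name course-v3 2>/dev/null
}

# Usage:
#   find_course_dir        # prints matching paths, if any
# Then cd into a printed path and run git pull there.
# If nothing is printed, the repository is not on this instance and
# would need to be cloned first:
#   git clone https://github.com/fastai/course-v3
```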