Platform: Google Cloud Platform (GCP)

@micstan I’ve talked to my teammate who is managing GCP and is the project owner; we’ve tried to do it under his user but couldn’t install it.
We’ve also tried to install from the root folder, but with no success (is that what you meant by root?).
We also tried sudo -v and sudo passwd, but nothing happened.

@mrgold By root I just mean sudo execution. Since the machine was created within your company’s project, my best guess is that there are security limitations on sudo permissions, and you may want to coordinate with your backend/admin. I haven’t faced this before; both my private VMs and my company’s VMs allow running with superuser privileges by default (you could check what you get with sudo whoami).
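A minimal sanity check for sudo access on the VM (a sketch; the group names are just the usual defaults on GCP Debian/Ubuntu images):

whoami        # the user you are logged in as
sudo whoami   # should print "root" if sudo works at all
sudo -l       # lists what this user is allowed to run with sudo
groups        # look for "sudo" or "google-sudoers" membership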

I was trying to help somebody else get set up with the new GCP AI Platform. It seems that the fastai documentation points to the Compute Engine quota increase website, and not the AI Platform quota increase website.

I’m honestly not sure how the process differs since I’m still using the old Compute Engine.

Hey @jwuphysics, I just answered on GitHub. The API manager now redirects to the Admin Quotas page, and IMHO the instruction in the link you shared is no longer valid; I guess they did not update the docs for AI Platform. AI Platform uses normal Compute Engine VMs, and you request quotas in exactly the same way as described in https://cloud.google.com/docs/quota#requesting_higher_quota
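If it helps, current quota values can also be inspected from the CLI before filing the increase request (a sketch; YOUR_PROJECT_ID and the region are placeholders):

gcloud compute project-info describe --project YOUR_PROJECT_ID   # project-wide quotas
gcloud compute regions describe us-west1                         # per-region quotas, including the GPU ones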

Hi @micstan

Thanks for the update, you’re absolutely right. I was encountering an unrelated error that has now been sorted out.

Non-preemptible instances are more expensive, and I feel like this is Google’s way of making more money off of you, but I guess we just have to suck it up and do that! This worked for me as well. Thanks.

When I was installing CUDA 10.2 per the GCP start-up guide I kept hitting the error “Unable to load the kernel module ‘nvidia.ko’”. The logs were not informative, and a bit of googling led me to suggestions that I needed to disable the nouveau driver.

I disabled nouveau using these steps: https://linuxconfig.org/how-to-disable-nouveau-nvidia-driver-on-ubuntu-18-04-bionic-beaver-linux
However, I still got the same error when trying to run the 10.2 installer.
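For reference, the nouveau-disabling steps from that guide boil down to roughly the following (a sketch for Ubuntu 18.04 as in the guide; reboot afterwards):

# Blacklist nouveau so the NVIDIA installer can load nvidia.ko
sudo bash -c 'cat > /etc/modprobe.d/blacklist-nvidia-nouveau.conf <<EOF
blacklist nouveau
options nouveau modeset=0
EOF'
sudo update-initramfs -u   # rebuild the initramfs so the blacklist takes effect
sudo reboot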

What ultimately fixed it for me was first uninstalling 10.1 and the nvidia drivers, then re-running the 10.2 installer.
# Remove the existing CUDA 10.1 toolkit
sudo /usr/local/cuda-10.1/bin/cuda-uninstaller
# Remove the existing NVIDIA driver
sudo /usr/bin/nvidia-uninstall

Note: I didn’t test if solely uninstalling the drivers fixed the issue or if disabling nouveau was also necessary.

Hope it’s helpful.

Hi all,

I have followed the process linked to from within the fastai book for setup on Google Cloud Platform: https://course.fast.ai/start_gcp

The pip command fails with a compatibility error; it looks like that doesn’t happen with pip3. I didn’t record the error, because the CUDA update problem seemed to make the notebook server fail on the next boot. If anyone is interested, I’m OK to redo the process to record it.
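In case the pip/pip3 mismatch is part of the problem, a quick check of which Python each one is bound to (a sketch):

which pip pip3
pip --version    # shows the Python it belongs to
pip3 --version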

Step 4: Install libraries fails.

I get the following error when doing the CUDA Update:

(base) jupyter@fastai-v2-2020-11-02:~$ cat /var/log/cuda-installer.log
[INFO]: Driver not installed.
[INFO]: Checking compiler version...
[INFO]: gcc location: /usr/bin/gcc

[INFO]: gcc version: gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1) 

[INFO]: Initializing menu
[INFO]: Setup complete
[INFO]: Components to install: 
[INFO]: Driver
[INFO]: 440.33.01
[INFO]: Executing NVIDIA-Linux-x86_64-440.33.01.run --ui=none --no-questions --accept-license --disable-nouveau --no-cc-version-check --install-libglvnd  2>&1
[INFO]: Finished with code: 256
[ERROR]: Install of driver component failed.
[ERROR]: Install of 440.33.01 failed, quitting
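When cuda-installer.log only reports that the driver component failed, the embedded NVIDIA installer usually writes the real reason (a conflicting driver, nouveau still loaded, etc.) to its own log; a sketch of where to look:

sudo cat /var/log/nvidia-installer.log   # detailed driver-install log
lsmod | grep nouveau                     # check whether nouveau is still loaded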

I have successfully tried using the method described at https://course19.fast.ai/start_gcp.html,
but I’m not sure how compatible that platform is with fastbook, Part1_2020, or part1_V3 – or indeed which one I should be using…

Thanks --PG

I ran into the same issue as @banksiaboy on GCP. My first problem was that I had not provisioned the instance with a GPU. After fixing that, I followed @ph147’s instructions and was able to get things running.


@jeremy I’m trying to set up a notebook instance on GCP, but I can’t find information on what options I should use to set up the machine. I’m on the following page:

This is the information I need:

  • Operating System
  • Environment
  • Machine type
  • GPU type
  • Boot disk type
  • Boot disk size
  • Data disk type
  • Data disk size

Please help!
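For reference, the same options can also be set from the command line with something like the sketch below; the instance name, zone, sizes, and image family are placeholders, and flag names may differ between gcloud versions, so check gcloud notebooks instances create --help:

gcloud notebooks instances create my-fastai-instance \
  --location=us-west1-b \
  --vm-image-project=deeplearning-platform-release \
  --vm-image-family=pytorch-latest-gpu \
  --machine-type=n1-standard-8 \
  --accelerator-type=NVIDIA_TESLA_T4 \
  --accelerator-core-count=1 \
  --install-gpu-driver \
  --boot-disk-size=200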

I just set up a new GCP notebook instance with the following specs

I think fastai comes installed with this setup, but I went through the fastai + fastbook install steps in the setup doc anyway. I didn’t go through the CUDA update steps, since I believe it’s already on CUDA 11.

I’m having trouble running the notebooks; I believe the kernel is crashing when trying to import fastbook. This is what I see in interactive mode:

Python 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import fastbook
terminate called after throwing an instance of 'std::runtime_error'
  what():  generic_type: cannot initialize type "WorkerId": an object with that name is already defined
Aborted

Does anyone have suggestions?

So I think the issue was that I ended up with multiple PyTorch versions and other dependency funkiness. I recreated an instance with this environment:

And installed libraries like so:

# Install fastai (and PyTorch as a dependency) from the fastai and pytorch channels
conda install -c fastai -c pytorch fastai
# Install the fastbook package
conda install -c fastai fastbook
# Bing image-search client used in the early notebooks
pip install azure-cognitiveservices-search-imagesearch
# Get the course notebooks themselves
git clone https://github.com/fastai/fastbook.git

Seems to work now.
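If anyone else hits the same “WorkerId” crash, a quick way to spot the duplicate-PyTorch situation and confirm the final install (a sketch):

conda list | grep -i torch    # more than one torch entry usually means trouble
pip list | grep -i torch
python -c "import torch, fastai; print(fastai.__version__, torch.cuda.is_available())"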

Hi all,

I am trying to set up fastai on GCP, and I got the output below when I tried to install with
“conda install -c fastai -c pytorch fastai”
Has anyone encountered this issue? I am not able to locate the file mentioned, so I couldn’t delete it manually.

Executing transaction: - WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /opt/conda/lib/python3.7/site-packages/torch/__pycache__/__init__.cpython-37.pyc. Please remove this file manually (you may need to reboot to free file handles)
done
ERROR conda.core.link:_execute(700): An error occurred while installing package ‘pytorch::pytorch-1.0.0-py3.7_cuda9.0.176_cudnn7.4.1_1’.
Rolling back transaction: done

[Errno 13] Permission denied: ‘/opt/conda/lib/python3.7/site-packages/torch/__pycache__/__init__.cpython-37.pyc’
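A couple of generic checks that might help narrow down the permission error (a sketch; the path is the one from the error message):

ls -ld /opt/conda /opt/conda/lib/python3.7/site-packages/torch   # who owns the conda install, and can your user write to it?
sudo lsof /opt/conda/lib/python3.7/site-packages/torch/__pycache__/__init__.cpython-37.pyc   # is a process still holding the old file open?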

I made an easy step-by-step guide to creating an AI Platform notebook instance, with everything up to date for fastai. I hope it helps those who don’t know what configuration to choose or how to start.


Is there a way I can store programs and data on a disk (a bucket?) and switch between different VMs depending on whether I’m coding (low-cost VM) or ‘learning’ (expensive GPU VMs)? I’m looking to keep a stable codebase + data on cheaper storage and switch between levels of compute as needed.

Would appreciate any help or pointers to resources. Thanks

This may be what you are looking for: Connecting to Cloud Storage buckets  |  Compute Engine Documentation
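A hedged sketch of what that workflow can look like (the bucket name and paths are placeholders): keep code and data in a bucket, then sync or mount it from whichever VM you are currently using.

# One-time: create a bucket in the same region as your VMs
gsutil mb -l us-west1 gs://my-fastai-bucket

# Push work from the cheap VM, pull it on the GPU VM (and the other way round)
gsutil -m rsync -r ~/projects gs://my-fastai-bucket/projects
gsutil -m rsync -r gs://my-fastai-bucket/projects ~/projects

# Or mount the bucket as a directory via Cloud Storage FUSE (gcsfuse), which is what the linked doc describes
mkdir -p ~/bucket
gcsfuse my-fastai-bucket ~/bucket

Another option, if you want to keep the exact same boot disk, is to stop the instance and change its machine type or GPU before restarting; the persistent disk carries over.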

Thank you so much! This is the solution that finally worked for me.

As of early July 2021, the instructions provided for the GCP setup here are no longer correct. In particular, there is no longer any need to manually install a different CUDA version (the default in GCP is now 11.0, and it works out of the box with PyTorch).

Using the other code in the official GCP setup page leads to weird library incompatibilities (in particular one that other people have mentioned which begins: “NameError: name ‘CallbackHandler’ is not defined”).

When I used the setup approach described on your github page, I was able to install everything with no issues and run the code in the notebook (only tested the first lesson so far but all of the code runs)!
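For anyone wanting to double-check before skipping the CUDA step, a quick sanity check (a sketch):

nvidia-smi   # shows the installed driver and the CUDA version it supports
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"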


Nice! I am glad that it was useful for you!

Just used this too. Thanks! So easy to follow, and it works perfectly :grinning:

Wow, I’m incredibly grateful to you for putting this together. It would be nice if this could get some official spotlighting as well. Thanks so much for taking the time!