Platform: Google Cloud Platform (GCP)

I have tried to set up a GCP machine as advised and ran into the following problems:

  1. Quotas: for an “n1-highmem-16” instance you need 16 CPUs, but the default quota is 8 (solution: increase the CPU quota for “us-west1-b” from 8 to 16 and the overall CPU quota from 12 to 16)
  2. CUDA driver: installing fastai2 upgrades torch from 1.4.0 to 1.6.0, and you then get an AssertionError saying “The NVIDIA driver on your system is too old” (solution: update the CUDA Toolkit to 11.0; a quick check is sketched below)
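
A quick way to check whether you have hit the driver mismatch (a rough sketch; the driver and torch versions you see will depend on your image):

# installed NVIDIA driver version
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# torch version, the CUDA version it was built against, and whether it can see the GPU
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"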

After that, everything seems to work fine.

Hi, I notice that GCP instructions are no longer included in the new course — any idea why?

The ‘course.fast.ai’ site used to be generated from the ‘https://github.com/fastai/course-v3’ repository, but with the new course the repository was switched to ‘https://github.com/fastai/course20’, which does not (yet?) include deployment documentation.

I’m using GCP and got everything running.
When I execute the first cell (imports) in the notebooks (e.g. in notebook 01_intro) I get the following ImportError:

ImportError: cannot import name 'mobilenet_v2' from 'torchvision.models' (/opt/conda/lib/python3.7/site-packages/torchvision/models/__init__.py)

I pip-installed the fastai2 library and updated it again, but the error is still the same.
Any idea?

I believe the instructions at the top of the thread are a bit out of date. I ran into the same issue and resolved it by uninstalling all of the existing fastai packages and then installing fastai (not fastai2; the latest version of the fastai package is fastai v2). Then the code ran (though there are still some outstanding problems in my setup that I’m trying to resolve).
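
Something along these lines should do it (a rough sketch; fastai2 was the pre-release package name and may or may not be present on your image):

# remove the old packages, then install the current fastai (v2)
pip uninstall -y fastai fastai2
pip install fastai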

This how-to post was also posted in the Fastai2 and new course thread, but it seemed more appropriate here in the GCP Platform thread.

Please note that fastai 2 requires torch 1.6.0 and torchvision 0.7.0. The CUDA drivers on the platform image are 10.1, which is too old for torch 1.6.0, and there are no PyTorch images in the deeplearning-platform-release family with 10.2 or 11 CUDA drivers. However, the drivers can be updated to 10.2 per @micstan above.

Here are the steps I followed to setup the GCP image with the new release and the book:

Follow the old GCP setup guide here: http://course19.fast.ai/start_gcp.html

Open a terminal. I use PuTTY.

Log in to the instance: gcloud compute ssh --zone "us-central1-b" "jupyter@fastai-4" -- -L 8080:localhost:8080
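
Adjust the zone and instance name to your own setup; “fastai-4” is just what I called my instance. The general form, with placeholders, is:

gcloud compute ssh --zone "YOUR_ZONE" "jupyter@YOUR_INSTANCE" -- -L 8080:localhost:8080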

Install CUDA 10.2

wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
sudo sh cuda_10.2.89_440.33.01_linux.run
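
The runfile walks you through an interactive prompt; if you would rather script it, I believe the installer also accepts flags for a non-interactive run (untested on my side):

# non-interactive variant (assumes the runfile's --silent/--driver/--toolkit flags)
sudo sh cuda_10.2.89_440.33.01_linux.run --silent --driver --toolkit
# confirm the new driver is active
nvidia-smi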

Install fastai 2, fastcore and fastbook

cd tutorials
mv fastai fastai.old
git clone --recurse-submodules https://github.com/fastai/fastai
pip install -e "fastai[dev]"
git clone --recurse-submodules https://github.com/fastai/fastcore
cd fastcore
pip install -e ".[dev]"
cd ..
git clone https://github.com/fastai/fastbook.git
cd fastbook
pip install -r requirements.txt
cd ..
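
A quick sanity check that the editable installs are the ones being picked up (optional):

python -c "import fastai, fastcore; print(fastai.__version__, fastcore.__version__)"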

Check to see if pytorch and cuda are happy

python -c 'import torch; print(torch.__version__); print(torch.version.cuda); print(torch.cuda.is_available()); print(torch.cuda.current_device())'
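
If the driver update worked, the output should look something like this (exact versions may differ):

1.6.0
10.2
True
0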

Test a few notebooks in the course and the fastbook folders

Launch local browser: http://localhost:8080/tree/tutorials

Verify the notebooks run

Run notebook from: http://localhost:8080/tree/tutorials/fastai/dev_nbs/course
Run notebook from: http://localhost:8080/notebooks/tutorials/fastbook/

Run a notebook that does some training to make sure the GPU is being used, by looking at the training times for the epochs and checking the sm and mem columns in the output of nvidia-smi dmon:

nvidia-smi dmon
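
By default dmon prints a new sample every second; you can restrict it to the utilization and memory groups and slow the sampling down, e.g.:

# u = utilization (sm/mem %), m = framebuffer memory; sample every 5 seconds
nvidia-smi dmon -s um -d 5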

Perhaps others will have a more elegant solution, but for something quick to get started I haven’t run into any issues running fastai v2 and notebooks on GCP this way.

Cheers and many thanks for all the stellar work on the course, book and API! Mark

@markphillips, I just started a doc for the setup (https://github.com/fastai/course20/pull/9). Google Cloud now has a simplified method using “AI Platform” that I believe can be easier for many people. Please feel free to modify it. It works for me, but I’m not an expert in GCP, so it would be great if others could edit it.

Nice! I’ll bet you’re right about this being easier for many people :grinning: An easier option with fewer knobs and an option with pretty much everything should cover all the bases.

Thanks for this, Mark! Looks like it’s working for me now. Minor nit, it’s print(torch.version.cuda). Thanks again!

Cool! I forgot to escape my underscores :grinning: print(torch.__version__);

Is your post above necessary to use a Tesla T4 on GCP? When I try to run
language_model_learner(qdl, AWD_LSTM, metrics=[accuracy, Perplexity()], wd=0.1).to_fp16()

I get the error AssertionError: Mixed-precision training requires a GPU, remove the call `to_fp16`.

I am following the examples in the text tutorial.

print(torch.__version__); print(torch.version.cuda); print(torch.cuda.is_available())
1.6.0 10.2 False

UPDATE: I followed the directions to install CUDA 10.2 and it is working now. I did not have to follow all of the extra GitHub steps.

Glad it’s working with the new drivers! As you note, the GitHub install is not required unless you want an editable version of fastai 2 and fastcore. If you don’t want that, you can simplify the process by using conda:

conda install -c fastai -c pytorch fastai

If you want fastbook, the GitHub clone will get you the notebooks for that.
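
So a minimal, non-editable setup would look roughly like this (a sketch, assuming a fresh image with conda on the path):

conda install -c fastai -c pytorch fastai
git clone https://github.com/fastai/fastbook.git
cd fastbook
pip install -r requirements.txt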

Guys, GCP is denying my request for a GPU quota increase. Any idea why this might be happening? Why does this even require an approval process?

@deep-learner Not all GPU models are available in all regions; that can be one of the reasons you got rejected. There is some advice here: https://groups.google.com/g/gce-discussion/c/UWpvMNqkVjc?pli=1

GCP is being more strict about this kind of thing. Try to keep your requests fairly conservative, and maybe mention that you are using it for the fastai course. If they deny it, try to follow up with them as well.

Update: I followed up with them and was told that it’s not possible to get a quota increase for an individual. They said it has to be for a company account (not a gmail account) with a 12-month minimum project. Can anybody else try this and share their experience, please? This seems very weird to me.

@deep-learner what is your quota request?

So are you unable to get any access to GPUs now?

I followed these instructions and requested an increase of “GPUs (all regions)” from 0 to 1. The request got denied, and then I was told that ALL quota requests coming from gmail accounts will be denied (the sales rep may have been wrong, but that’s what I was told).

I am curious to see whether any other new users are able to get a GPU. Mine is just one incident; we should wait for more reports. If this trend continues, maybe it’s time to stop recommending GCP for the fastai course.