I tried to set up a GCP machine as advised and ran into the following problems:
Quotas: The “n1-highmem-16” instance needs 16 CPUs, but the default quota is 8 (solution: increase the CPU quota for “us-west1-b” from 8 to 16 and the overall CPU quota from 12 to 16)
CUDA driver: Installing fastai2 upgrades torch from 1.4.0 to 1.6.0, after which you get an AssertionError saying “The NVIDIA driver on your system is too old” (solution: update the CUDA Toolkit to 11.0)
I’m using GCP and got everything running.
When I execute the first cell (imports) in the notebooks (e.g. in notebook 01_intro) I get the following ImportError:
ImportError: cannot import name 'mobilenet_v2' from 'torchvision.models' (/opt/conda/lib/python3.7/site-packages/torchvision/models/__init__.py)
I pip installed the fastai2 library and updated it again. The error is still the same.
Any idea?
I believe the instructions at the top of the thread are a bit out of date. I ran into the same issue and resolved it by uninstalling all of the existing fastai packages and then installing fastai (not fastai2 – the latest version of the fastai package is fastai v2). Then the code ran (though there are still some outstanding problems in my setup I’m trying to resolve).
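To see whether an old fastai2 install is still sitting next to fastai, a minimal sketch like the following can list what is installed. It assumes Python 3.8+ for `importlib.metadata` (on the 3.7 image the `importlib_metadata` backport offers the same API), and the helper name `installed_versions` is made up for illustration.

```python
# Sketch: report which fastai-related distributions are installed.
# Assumes Python 3.8+; the function name is hypothetical, not part of fastai.
from importlib import metadata


def installed_versions(names=("fastai", "fastai2", "fastcore")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If both fastai and fastai2 show a version, that matches the situation described above, where uninstalling the old packages and installing only fastai fixed the import.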
This how-to post was also posted in the Fastai2 and new course thread, but it seemed more appropriate here in the GCP Platform thread.
Please note that fastai 2 requires torch-1.6.0 and torchvision-0.7.0. The CUDA drivers on the platform image are 10.1, which is too old for torch-1.6.0. There are no PyTorch images in the deeplearning-platform-release family with CUDA 10.2 or 11 drivers. However, the 10.2 drivers can be updated per @micstan.
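A quick way to confirm the versions mentioned above is to compare what is installed against them. This is only a sketch: it treats 1.6.0 and 0.7.0 as minimum versions (an assumption, since the post does not say whether newer releases also work), it assumes Python 3.8+ for `importlib.metadata`, and `parse_version` and `check_requirements` are illustrative names.

```python
# Sketch: compare installed torch/torchvision against the versions quoted above.
# Assumption: the quoted versions are treated as minimums, not exact pins.
import re
from importlib import metadata

REQUIRED = {"torch": (1, 6, 0), "torchvision": (0, 7, 0)}


def parse_version(text):
    """Turn '1.6.0+cu101' into (1, 6, 0), ignoring any build suffix."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def check_requirements(required=REQUIRED):
    """Return {name: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, parse_version(installed) >= minimum)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        print(f"{name}: {installed or 'not installed'} ({'ok' if ok else 'too old or missing'})")
```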
Here are the steps I followed to set up the GCP image with the new release and the book:
Run a notebook with training to make sure the GPU is being used, by looking at the training times for the epochs and checking the sm and mem columns in the output of nvidia-smi dmon
nvidia-smi dmon
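If you would rather collect the sm and mem readings from a script than watch the terminal, something like the sketch below may help. It assumes the usual dmon layout, where a header line starting with "# gpu" names the columns, and it uses the `-c` option to stop after a fixed number of samples. The function names are made up for illustration, and the column set can differ between driver versions, which is why the columns are looked up by name.

```python
# Sketch: read the sm and mem columns from `nvidia-smi dmon` output.
# Assumes a header line beginning with "# gpu"; helper names are hypothetical.
import shutil
import subprocess


def parse_dmon(text):
    """Parse dmon output into a list of {column_name: value} dicts."""
    columns, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            fields = line.lstrip("#").split()
            # The "# gpu ..." line names the columns; the units line is skipped.
            if columns is None and fields and fields[0] == "gpu":
                columns = fields
            continue
        if columns is not None:
            values = line.split()
            if len(values) == len(columns):
                rows.append(dict(zip(columns, values)))
    return rows


def sample_utilisation(samples=5):
    """Run dmon for a few samples; returns None if nvidia-smi is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "dmon", "-c", str(samples)],
        capture_output=True, text=True, check=True,
    )
    return [(row.get("sm"), row.get("mem")) for row in parse_dmon(result.stdout)]


if __name__ == "__main__":
    print(sample_utilisation())
```

Run it while a notebook is training: sm values that stay at 0 suggest the training is not actually using the GPU.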
Perhaps others will have a more elegant solution, but for something quick to get started I haven’t run into any issues running fastai v2 and notebooks on GCP this way.
Cheers and many thanks for all the stellar work on the course, book and API! Mark
@markphillips, I just started a doc for the setup (https://github.com/fastai/course20/pull/9). Google Cloud now has a simplified method using “AI platform” that I believe can be easier for many people. Please feel free to modify it. It works for me, but I’m not an expert in GCP, so it would be great if others could edit it.
Nice! I’ll bet you’re right about this being easier for many people. An easier option with fewer knobs, plus an option with pretty much everything, should cover all the bases.
Is your post above necessary to use a Tesla T4 on GCP? When I try to run language_model_learner(qdl, AWD_LSTM, metrics=[accuracy, Perplexity()], wd=0.1).to_fp16()
I get the error AssertionError: Mixed-precision training requires a GPU, remove the call `to_fp16`.
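That assertion suggests no CUDA device was found, so it can help to check what PyTorch itself sees, independently of fastai. A minimal sketch, with `gpu_status` as an illustrative name and the torch import guarded so the script still runs where torch is missing:

```python
# Sketch: report whether PyTorch can see a CUDA device.
# gpu_status is a hypothetical helper, not part of fastai or torch.
def gpu_status():
    """Summarise the torch install and CUDA visibility as a dict."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False}
    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": available,
        "device_name": torch.cuda.get_device_name(0) if available else None,
    }


if __name__ == "__main__":
    for key, value in gpu_status().items():
        print(f"{key}: {value}")
```

If cuda_available comes back False on a machine with a T4 attached, the driver mismatch discussed earlier in the thread is a likely cause.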
Glad it’s working with the new drivers! As you note, the github install is not required unless you want an editable version of fastai 2 and fastcore. If you don’t want that you could simplify the process by using conda:
conda install -c fastai -c pytorch fastai
If you want the fastbook the github clone will get the notebooks for that…
GCP is being stricter about this kind of thing. Try to keep your requests fairly conservative, and maybe also mention that you are using it for the fastai course. If they deny the request, try to follow up with them as well.
Update: I followed up with them and was told that it’s not possible to get a quota increase for an individual. They said it has to be for a company account (not a gmail account) with a 12 month minimum project. Can anybody else try this out and share their experience, please? This seems very weird to me.
I followed these instructions and requested to increase “GPUs (all regions)” from 0 to 1. Request got denied, then I was told that ALL quota requests coming from gmail accounts will get denied (sales rep may have been wrong, but that’s what I was told).
I am curious to see if any other new users are able to get a GPU. Mine is just one incident. We should wait to see some more reports. If this trend continues, maybe it’s time to stop recommending GCP for the fastai course.