Lesson 1 using Google Cloud VM (step-by-step installation with shell script)

Here is a step-by-step process, with a shell script, for people who have free Google Cloud credits and want to use a Google Cloud VM. The shell script installs Keras and PyTorch with GPU support. The lowest-end GPU that Google offers is the Tesla K80; the Tesla P100 is also supported. I managed to run lesson 1 using one K80 with 4 CPUs and 100 GB of storage.

Step 0: Check whether you are eligible for the free $300 Google Cloud credit at https://cloud.google.com/free/.

Step 1: Go to www.console.cloud.google.com. You will be greeted with the console home screen.

Step 2: Move the cursor to the left corner and select VM instances under Compute Engine.

Step 3: Click Create Instance to create a new VM.

Step 4: Customize the VM as you wish. I selected Ubuntu 16.04 with 100 GB of storage, 4 CPUs, and 15 GB of RAM. There is also one K80 GPU selected at the bottom.



Click Create at the bottom.
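
For anyone who prefers the command line to the console, here is a rough sketch of what the choices in Step 4 might look like with the gcloud CLI. This is not from the original post. The instance name and zone are hypothetical placeholders, and the image family and GPU type are the ones from the time of this thread, so they may no longer be offered. The function only prints the command so you can review it before running it yourself.

```shell
#!/bin/sh
# Print (do not run) a gcloud command that mirrors Step 4:
# Ubuntu 16.04, 100 GB disk, 4 vCPUs / 15 GB RAM (n1-standard-4), one K80.
# $1 = instance name, $2 = zone. Both are placeholders you must choose.
build_create_cmd() {
  cmd="gcloud compute instances create $1 --zone $2"
  cmd="$cmd --machine-type n1-standard-4"
  cmd="$cmd --accelerator type=nvidia-tesla-k80,count=1"
  cmd="$cmd --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud"
  cmd="$cmd --boot-disk-size 100GB"
  # GPU instances cannot live-migrate, so they must terminate on maintenance.
  cmd="$cmd --maintenance-policy TERMINATE"
  printf '%s\n' "$cmd"
}

# Example with made-up values: review the printed command, then paste it.
build_create_cmd fastai-vm us-east1-c
```

Printing instead of executing is deliberate: creating a GPU instance starts billing immediately, so it is worth reading the command first.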

Step 5: Click SSH under Connect for the desired instance. A shell will pop up in a separate window.

Step 6: Copy and paste the script below into the shell, or save it as myscript.sh and run it from the terminal with bash myscript.sh

#!/bin/bash
# This script is designed to work with Ubuntu 16.04 LTS,
# with Keras 1.2.2 and the latest PyTorch with CUDA 8 support.
##########################################################################
# Install the CUDA 8 driver for the Tesla K80
##########################################################################

echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! dpkg-query -W cuda-8-0; then
  # The 16.04 installer works with 16.10.
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  # Installing packages needs root, so these commands run under sudo.
  sudo dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  sudo apt-get update
  sudo apt-get install cuda-8-0 -y
fi

#########################################################################

#############################################################################
#Updating the system
#############################################################################

sudo apt-get update
sudo apt-get --assume-yes upgrade
sudo apt-get --assume-yes install tmux build-essential gcc g++ make binutils
sudo apt-get --assume-yes install software-properties-common

#########################################################################################################################
#Installing anaconda with the required packages
#########################################################################################################################

wget "https://repo.continuum.io/archive/Anaconda3-4.3.0-Linux-x86_64.sh" -O "Anaconda3-4.3.0-Linux-x86_64.sh"
bash Anaconda3-4.3.0-Linux-x86_64.sh -b
echo "export PATH=\"$HOME/anaconda3/bin:\$PATH\"" >> ~/.bashrc
export PATH="$HOME/anaconda3/bin:$PATH"
conda install -y bcolz
conda upgrade -y --all

#########################################################################################################################
#Installing Keras with TensorFlow, as well as the Kaggle client
#########################################################################################################################

pip install keras==1.2.2
pip install tensorflow
pip install kaggle-cli

#You can configure your Kaggle account details here
#kg config -u 'username' -p 'password' -c 'dogs-vs-cats-redux-kernels-edition'
#kg download


#########################################################################################################################
#Installing Jupyter notebook
#########################################################################################################################

# configure jupyter and prompt for password
jupyter notebook --generate-config
jupass=`python -c "from notebook.auth import passwd; print(passwd())"`
echo "c.NotebookApp.password = u'"$jupass"'" >> $HOME/.jupyter/jupyter_notebook_config.py
echo "c.NotebookApp.ip = '*'
c.NotebookApp.open_browser = False" >> $HOME/.jupyter/jupyter_notebook_config.py
echo "\"jupyter notebook\" will start Jupyter on port 8888"
echo "If you get an error instead, try restarting your session so your \$PATH is updated"

#########################################################################################################################
#Downloading the old Fast AI courses (Part 1 and Part 2), as well as the new Fast AI Part 1
#########################################################################################################################

cd ~
git clone https://github.com/fastai/courses.git
git clone https://github.com/fastai/fastai.git

#########################################################################################################################
#Installing the Google Compute Engine package, unzip, and gensim (useful for Fast AI Part 2 - 2017)
#########################################################################################################################

sudo apt-get --assume-yes install unzip
pip install --upgrade gensim
pip install google-compute-engine

#########################################################################################################################
#Installing all the relevant packages for doing lesson 1 of Fast AI Part 1 - 2017
#########################################################################################################################

# -y keeps conda from stopping the script to ask for confirmation
conda install -y pytorch torchvision cuda80 -c soumith
pip install torchtext
conda install -y opencv
pip install isoweek
pip install pandas_summary

Step 7: Finally, the most important step. In your terminal, type source ~/.bashrc and press Enter so that the updated PATH takes effect. You are then good to go.

Step 8: To run Jupyter Notebook, type jupyter notebook &. Then, using the external IP address of the instance from Step 5, open a browser and go to http://ip-address:8888.
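
If the page does not load, a small helper like the one below can confirm the URL and check from your own machine whether anything answers on the port. This is only a sketch: the function names are made up for illustration, the IP shown is a documentation placeholder, and it assumes curl is installed locally.

```shell
#!/bin/sh
# Build the plain-HTTP notebook URL for an external IP.
# $1 = external IP of the VM, $2 = port (defaults to 8888).
notebook_url() {
  printf 'http://%s:%s\n' "$1" "${2:-8888}"
}

# Return 0 if something answers at that URL within 5 seconds.
# A non-zero status usually means a firewall rule or a stopped server.
notebook_reachable() {
  curl --silent --output /dev/null --max-time 5 "$(notebook_url "$1" "$2")"
}

# Placeholder IP; replace it with the external IP of your instance.
notebook_url 203.0.113.10
```

Note that the URL uses http, not https: the script above does not set up a certificate, so an https address will fail to connect.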

If you have already finished the free trial and are willing to spend more, these are the estimated costs:

(2 CPUs with 100 GB storage) $563.54/month estimated, hourly rate $0.772 (730 hours/month)*
(2 CPUs with 300 GB storage) $571.54/month estimated, hourly rate $0.783 (730 hours/month)*

(4 CPUs with 100 GB storage) $612.09/month estimated, hourly rate $0.838 (730 hours/month)*
(4 CPUs with 300 GB storage) $620.09/month estimated, hourly rate $0.849 (730 hours/month)*

*with one Tesla K80 GPU
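
The monthly figures are roughly the hourly rate multiplied by 730 hours of continuous use. Here is a minimal sketch of that arithmetic (the function name is made up). Because the hourly rates above are rounded to three decimals, the result can differ from the listed monthly figure by a few cents.

```shell
#!/bin/sh
# Estimate a monthly bill from an hourly rate.
# $1 = hourly rate in USD, $2 = hours per month (defaults to 730).
monthly_cost() {
  LC_ALL=C awk -v rate="$1" -v hours="${2:-730}" \
    'BEGIN { printf "%.2f\n", rate * hours }'
}

monthly_cost 0.772       # 2 CPUs, 100 GB, one K80, running all month: 563.56
monthly_cost 0.772 40    # the same machine used for only 40 hours
```

The second call shows why the actual bill is usually much lower than the estimate: you only pay for the hours the instance is running, so stop it when you are done.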

Thanks for doing this! BTW might be good to add a link to this post from the lesson 1 wiki thread.

Google Cloud GPUs are cheaper now.

In US regions, each K80 GPU attached to a VM is priced at $0.45 per hour while each P100 costs $1.46 per hour.

First of all, thank you to @mmr for creating the step-by-step guide. I followed this guide previously and it works. However, today AI Saturdays published a comprehensive guide on how to set up Google Cloud Platform (GCP) for part 1 version 2 (v2).

For those visiting this topic, I think they should check out the new guide for a more comprehensive experience. The guide is using the Paperspace script with minor tweaks to support GCP. The tweaks are:

  1. Replaced the message asking Paperspace users to reboot their computer with a command that reboots the GCP instance.
  2. Modified firewall configuration command: sudo ufw allow 8888:8898/tcp --> sudo ufw allow 8888/tcp
  3. Added #! /bin/bash

Update 2018-01-04 6:20 AM SGT: I have tested & verified this guide. The Paperspace shell script works perfectly for Google Cloud VM instance with a fresh Ubuntu 16 LTS install. I can connect & access Jupyter Notebook in my browser.

Update 2018-01-04 12:00 PM SGT: Tested with the lesson 1 notebook (lesson1.ipynb). The Jupyter Notebook kernel keeps dying when I try to train a dogs vs cats model with 3 lines of code that look like:

data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(0.01, 3)

When that happens, you will see a message like “Kernel Restarting. The kernel appears to have died. It will restart automatically.” Upon investigation, I found that the cause was my GCP instance running out of memory (OOM). I was using n1-standard-1 (1 vCPU, 3.75 GB memory). To resolve this issue, you must stop the VM instance and edit its machine type. I upgraded my VM instance to n1-standard-2 (2 vCPUs, 7.5 GB memory) and things work fine for now.
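
Before starting a long training run, it may be worth checking how much RAM the instance actually has. Below is a rough sketch that reads /proc/meminfo. The function names and the 7000 MB threshold are my own assumptions based on the 7.5 GB machine type mentioned above, not an official requirement.

```shell
#!/bin/sh
# Print total RAM in MB from a meminfo-style file (default: /proc/meminfo).
mem_total_mb() {
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Succeed only if total RAM is at least $1 MB.
# $2 is an optional meminfo file, mainly useful for testing.
has_min_memory() {
  [ "$(mem_total_mb "$2")" -ge "$1" ]
}

# 7000 MB is an assumed threshold, slightly below the 7.5 GB machine type.
if has_min_memory 7000; then
  echo "Memory looks sufficient for lesson 1."
else
  echo "Low memory: consider a larger machine type before training."
fi
```

You can also run free -m while the notebook is training to watch memory usage climb.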

This is the correct script. There is a problem with the driver version in the first script. Thanks.

You can secure your Jupyter notebook with a certificate: http://jupyter-notebook.readthedocs.io/en/stable/public_server.html

If you want to add something to the post, do it. I would be happy to edit my post.

Sure. Here are some of the differences I found:

if ! dpkg-query -W cuda-8-0; then

↑ this fails without sudo

curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb

↑ the other script is using version 9.0.176-1.

dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb

↑ version change

Also, the other script installs a few other libraries. I am not sure where they are required.

sudo apt -y install qtdeclarative5-dev qml-module-qtquick-controls
sudo add-apt-repository ppa:graphics-drivers/ppa -y
wget http://files.fast.ai/files/cudnn-9.1-linux-x64-v7.tgz
tar xf cudnn-9.1-linux-x64-v7.tgz

A reboot is required after all installation.

To check that the driver is correctly installed, we can use the nvidia-smi command.
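
Here is a small sketch of how that check could be wrapped up after the reboot. The function names are made up. The second function assumes PyTorch is installed in the Python that is on your PATH.

```shell
#!/bin/sh
# Succeed if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print the GPU name and driver version, or explain what is missing.
check_gpu() {
  if ! have_cmd nvidia-smi; then
    echo "nvidia-smi not found: the NVIDIA driver is probably not installed."
    return 1
  fi
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}

# Ask PyTorch whether it can see the GPU (prints True or False).
check_torch_cuda() {
  python -c 'import torch; print(torch.cuda.is_available())'
}

check_gpu || echo "If you have just installed the driver, reboot and try again."
```

If check_gpu lists the card but check_torch_cuda prints False, the driver is fine and the problem is more likely in the CUDA or PyTorch installation.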

Thanks.

Hi. A gentle reminder. Please follow these steps on how to ask for help to ensure that you get a helpful answer. Thanks.

http://wiki.fast.ai/index.php/How_to_ask_for_Help

That would happen if the firewall isn’t configured correctly. Check step 7 again in the guide and let us know what you find.

Try with http:// not https://

No problem. Please try the possible solutions from jparkrr and mmr and report back your findings. Chances are high that the problem is due to https in the Jupyter Notebook URL, as mmr spotted.

@mmr thanks a lot for creating this - I’ve turned it into a wiki now so that others can edit it to help keep it up to date. I hope that’s OK - let me know if you’d like me to revert that change.

@cedric those are great suggestions for changes. Adding cudnn is particularly important. Do you want to edit the original post directly to make those fixes?

No problem.

OK, I will edit the original post directly to include those suggestions and fixes soon. Thanks.

FWIW, I had to use an n1-standard-4 (4 vCPUs, 15 GB memory) instance to make it through lesson 1 training. n1-standard-2 still crashed.

@mmr may I know how you calculated these estimates? Right now I am using the Google calculator and the estimate is coming out scarily humongous!

It seems that using a GPU with the free 300 USD is not possible anymore. I have just tried, and it is specified in https://cloud.google.com/free/docs/frequently-asked-questions:

You cannot add GPUs to your Compute Engine instances.

Thanks @steffenix for putting the link here. I just started the course. I followed several posts that share the steps to set up GCP for this course without going through the Free Tier documentation updates. Each time I raised a quota request, I got back an email with upfront payment steps. I guess we can't use the GCP Free Tier for this course anymore.

It seems you have to upgrade the account to a paid account (you can keep using the free 300 USD). Then you have to request a quota change at https://console.cloud.google.com/iam-admin/quotas

After that, I got an automated email asking me to make a payment of US$35 “to help us ensure that this is a legitimate request”.

I paid the 35 USD and requested the quota increase, and I now have access to GPUs.