How to set up env using Google Cloud?


#22

Thanks. I looked into cs231n and https://haroldsoh.com/2016/04/28/set-up-anaconda-ipython-tensorflow-julia-on-a-google-compute-engine-vm/ and got conda and TensorFlow installed. I have Jupyter up and running as well now. I also checked the link eshvk posted, but it had a few scripts and I was not sure how or where to run them.

Hope I have everything I need now. You were awesome, and thanks for the quick responses.


#23

Hi - I was trying Google Cloud with 2 vCPUs (13 GB memory, the high-memory version as well), but it takes very long (I waited for 15 minutes and the sample folder from lesson 1 still had not completed). Can you please advise, from your findings, on the minimum configuration for getting the sample data to work and the minimum configuration for the actual data (e.g. Lesson 1 data)?

I was trying to see which configuration to use to optimize the cost incurred. Thanks in advance.


#24

Did you attach a GPU? Have you checked if Theano is using a GPU?

I ran the first lessons on AWS, but the execution times should be similar on GCE, if you have everything set up properly.
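
A quick way to check the Theano side, as a rough sketch: look at the device setting in the Theano config file. This assumes Theano is configured through ~/.theanorc (the THEANO_FLAGS environment variable can override it), and theano_device is just a helper name made up for this example.

```shell
# Print the "device" setting from a Theano config file (default: ~/.theanorc).
# theano_device is a hypothetical helper name, not part of Theano.
theano_device() (
  rc="${1:-$HOME/.theanorc}"
  if [ ! -r "$rc" ]; then
    echo "no readable config at $rc" >&2
    return 1
  fi
  # Matches lines such as "device = gpu" or "device=cuda0"
  sed -n 's/^[[:space:]]*device[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*/\1/p' "$rc" | head -n 1
)

# On the instance itself you can also ask Theano directly:
#   python -c 'import theano; print(theano.config.device)'
```

If this prints cpu (or nothing at all), Theano is most likely falling back to the CPU, which would explain very long run times.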


#25

By the way, please don’t take this the wrong way, but have you read http://wiki.fast.ai/index.php/How_to_ask_for_Help

I’d like to help, but please try to solve a problem by yourself first, and give enough information on what you have tried, what the possible cause might be, et cetera, before asking for help. And do a search first, here on the forums, but also a Google search, to see if you can find an answer to your question yourself.


#26

Thanks. I tried all combinations without enabling a GPU (as per the post at http://cs231n.github.io/gce-tutorial/), but it did not work. Since you were able to set things up on Google Cloud, I thought of asking you for suggestions. Enabling a GPU increases the cost, and I know I will make a lot of mistakes while learning, so I wanted to see how I can maximise my hours available for ML and whether it can be done without enabling a GPU. Sorry for being a pain. You have been of great help, and thanks a bunch.


#27

You need GPUs for this course; this is mentioned by Jeremy in the first lesson, and it is also explained in the notes of the first lesson. You can do part of the first lesson on an instance similar to t2.large, but it will take a long time.

Again, I love to help, but please watch the video and read the notes first.


#28

wanted to see how I can maximise my hours available for ML

Dropping by here with a quick anecdotal note. I ran dogs vs cats enhanced on a single-K80 GPU instance, and it took maybe six minutes per training epoch.

I am not convinced that using a CPU is that much cheaper once you weigh cost against time, especially for these lessons.


#29

Just in case anyone is still having issues getting things running on Google Cloud, here are my steps for creating a DL instance from scratch. I’m using the script from https://github.com/fastai/courses. This uses Python 2.7 and Keras 1.2.2. I recommend using this config (instead of Python 3 and Keras 2) unless you know what you are doing.

STEP 1
I assume you already know how to create an instance on Google Cloud. See https://cloud.google.com/compute/docs/instances/create-start-instance if you don’t.

Create an n1-standard-1 instance with Ubuntu 16.04, a single GPU and a boot disk of 20 GB. Create the instance in a zone where GPUs are available; see https://cloud.google.com/compute/docs/gpus/. You can use a different instance type with more CPU or memory if you want. Give the instance the network tag jupyter. We need this to create a firewall rule later.

Optionally, create and attach a persistent data disk. This is not required for the lessons, but it can be useful if you want to keep data or models when deleting the instance. I named the instance “deeplearning”, so this becomes the name of the boot disk too and I named the data-disk “deeplearning-data”.
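
For those who prefer the command line over the console, STEP 1 could be sketched roughly as below. The function only prints the gcloud command so you can review it before running it. The zone (us-east1-d), the GPU type (nvidia-tesla-k80) and the image family (ubuntu-1604-lts) are my assumptions, not something the steps above prescribe, and create_instance_cmd is a made-up helper name.

```shell
# Print (dry run) a gcloud command for the instance described in STEP 1.
# Zone, GPU type and image family are assumptions; adjust them to your quota.
create_instance_cmd() (
  name="${1:-deeplearning}"
  zone="${2:-us-east1-d}"
  gpu="${3:-nvidia-tesla-k80}"
  # GPU instances cannot live-migrate, hence --maintenance-policy TERMINATE
  printf '%s ' gcloud compute instances create "$name" \
    --zone "$zone" \
    --machine-type n1-standard-1 \
    --accelerator "type=${gpu},count=1" \
    --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
    --boot-disk-size 20GB \
    --maintenance-policy TERMINATE --restart-on-failure \
    --tags jupyter
  printf '\n'
)

# Review the output first, then run it with:
#   eval "$(create_instance_cmd deeplearning us-east1-d)"
```

Passing the tag at creation time means the firewall rule from STEP 6 applies to the instance right away.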

STEP 2
Ssh into the instance.

STEP 3
Download the script that installs CUDA, Anaconda etc:
wget https://raw.githubusercontent.com/fastai/courses/master/setup/install-gpu.sh

STEP 4
Run the script:
sudo sh install-gpu.sh
At the end you need to pick a password for the jupyter notebook. This script also clones the course materials from https://github.com/fastai/courses/.

STEP 5
Reboot, either using the reboot command or the reset option in the console:
sudo reboot

STEP 6
Create a firewall rule for accessing port 8888 from your local machine, using the console or the command line:

export PROJECT="project_name"
export YOUR_IP="enter_the_ip_of_your_local_machine"
gcloud beta compute --project "${PROJECT}" firewall-rules create "jupyter" --allow tcp:8888 --direction "INGRESS" --priority "1000" --network "default" --source-ranges "${YOUR_IP}" --target-tags "jupyter"

STEP 6b
When the instance has restarted, ssh into the instance again and check if CUDA is installed properly:

sudo modprobe nvidia
nvidia-smi
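
If you want this check in a script, here is a small sketch that fails with a readable message when the driver tools are missing. check_gpu is a hypothetical helper name; the optional argument is only there so the function can be tried on a machine without a GPU.

```shell
# Report the GPU seen by the driver, or explain what is missing.
# The optional argument (tool name) defaults to nvidia-smi.
check_gpu() (
  tool="${1:-nvidia-smi}"
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "$tool not found: the NVIDIA driver is probably not installed" >&2
    return 1
  fi
  # One CSV line per GPU: name, driver version, total memory
  "$tool" --query-gpu=name,driver_version,memory.total --format=csv,noheader
)
```

An empty result or an error here usually means the install script did not finish or the instance has not been rebooted yet.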

STEP 7
Run jupyter notebook:
jupyter notebook --ip=0.0.0.0 --port=8888
Note the token displayed in the terminal. Then open http://<external IP of your instance>:8888 in your browser and enter the token.
At this point everything is set up for doing the lessons.
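
To avoid typos in the address, a tiny sketch that builds the notebook URL from the external IP and the token. notebook_url is a made-up helper name, and the IP and token in the usage line are placeholders.

```shell
# Build the URL for the notebook server: http://IP:PORT/?token=TOKEN
notebook_url() (
  ip="$1"; token="$2"; port="${3:-8888}"
  printf 'http://%s:%s/?token=%s\n' "$ip" "$port" "$token"
)

# The external IP can be looked up from your local machine, for example:
#   gcloud compute instances describe deeplearning --zone "$ZONE" \
#     --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
# Then:
#   notebook_url 203.0.113.7 abc123
```

Note the /? before token= in the result; leaving it out is an easy mistake to make when typing the address by hand.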

STEP 8 (optional)
Format and mount the data-disk with the following commands. This mounts the data disk to the /opt/my_data directory. Feel free to use a different location.

export MOUNT_DIR=/opt/my_data
# $(hostname) returns the name of the instance
export DISK_NAME=$(hostname)-data
export DISK_MOUNT_POINT=/dev/disk/by-id/google-${DISK_NAME}

sudo mkdir -p ${MOUNT_DIR}
sudo chmod 777 ${MOUNT_DIR}

# uncomment the next line to format the disk; don't do this if your data disk already has data
# sudo mkfs.ext4 -F ${DISK_MOUNT_POINT}
sudo mount -o discard,defaults ${DISK_MOUNT_POINT} ${MOUNT_DIR}

# back up fstab
sudo cp /etc/fstab /etc/fstab_old

# change fstab so the disk is mounted after a reboot
printf "${DISK_MOUNT_POINT} ${MOUNT_DIR} ext4 defaults 0 0\n" | sudo tee -a /etc/fstab
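
One caveat with the last command: running it twice adds the same line to /etc/fstab twice. A possible guard, as a sketch only (missing_line is a hypothetical helper name):

```shell
# Print ENTRY only if FILE does not already contain that exact line.
missing_line() (
  file="$1"; entry="$2"
  grep -qxF -- "$entry" "$file" 2>/dev/null || printf '%s\n' "$entry"
)

# Typical use, with the variables from STEP 8:
#   entry="${DISK_MOUNT_POINT} ${MOUNT_DIR} ext4 defaults 0 0"
#   missing_line /etc/fstab "$entry" | sudo tee -a /etc/fstab
#   sudo mount -a   # complains right away if the fstab entry is broken
```

Because tee -a receives no input when the line is already present, the file is left untouched on a second run.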

(Brookie Guzder-Williams) #30

This is my setup:

There's a setup script you need to upload to your instance and a readme on how to do the rest.


(Brookie Guzder-Williams) #31

Someone mentioned I needed to add the pip installation - just updated the markdown doc


(Santhosh Kumar) #32

Please add

#sudo apt-get install git

also to the script


(Santhosh Kumar) #33

When I try to install the GPU software on my Google Cloud instance using the script, it fails at the step

$ sudo apt-get -y install cuda

and the error message is

$ sudo apt-get -y install cuda
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
cuda : Depends: cuda-8-0 (>= 8.0.61) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.


(mele) #34

Thank you, sebastian, for your instructions.
I finished up to step 7, but I can’t access the Jupyter notebook (address: external IP:8888?tokenXXXXXXXXX).
The error messages are “access denied” or “took too long to respond”.
I suspect the problem is in the firewall rules.
I used your step 6 firewall rule and then tried another one (such as --source-ranges 0.0.0.0/0).
How can I fix it?


#35

Did you start Jupyter with:
jupyter notebook --ip=0.0.0.0 --port=8888?

The firewall settings should look similar to this:

The source IP range should have the IP address of your local machine. (So not 0.0.0.0).

And finally, did you add the tag jupyter to your instance?
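
The tag can also be checked and added from the command line. A rough sketch: has_tag is a made-up helper, and it assumes the tag list comes back separated by semicolons (commas and spaces are accepted too, to be safe).

```shell
# has_tag TAG "TAGS": succeed if TAG occurs in a list of network tags.
has_tag() (
  want="$1"
  for t in $(printf '%s' "$2" | tr ';,' '  '); do
    [ "$t" = "$want" ] && return 0
  done
  return 1
)

# Typical use from your local machine (instance name and zone are examples):
#   tags=$(gcloud compute instances describe deeplearning --zone "$ZONE" \
#     --format='value(tags.items)')
#   has_tag jupyter "$tags" || \
#     gcloud compute instances add-tags deeplearning --zone "$ZONE" --tags jupyter
```

Without the tag, the firewall rule exists but does not apply to the instance, so the browser simply times out.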


(mele) #36

Thank you very much!!
I missed adding the tag to my instance. Now I can access my jupyter notebook.
I really appreciate your reply.


#37

No problem, glad you got it working!


(Mahon Baldwin) #38

I’ve been following along to get my instance running too. I’m wondering how to set up the network rules so that I don’t have to change the rule every time my ISP changes my IP address.


#39

Maybe create a bash script that changes the firewall settings automatically, before you start jupyter?

Or you’ll have to look at https://jupyterhub.readthedocs.io/en/latest/ for securing jupyter itself, instead of relying on firewall settings for security.
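
The bash script idea could look roughly like this. It is a sketch, not a tested tool: it assumes the rule is named jupyter as in my steps, that gcloud already has a default project set, and that you look up your public IP with some external service (checkip.amazonaws.com in the usage comment is just one option). The function names are made up.

```shell
# Succeed if the argument looks like a dotted-quad IPv4 address.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Print the gcloud command that points the firewall rule at a single IP.
update_rule_cmd() (
  ip="$1"; rule="${2:-jupyter}"
  is_ipv4 "$ip" || { echo "not an IPv4 address: $ip" >&2; return 1; }
  printf 'gcloud compute firewall-rules update %s --source-ranges %s/32\n' "$rule" "$ip"
)

# Typical use on your local machine, before starting Jupyter:
#   ip=$(curl -s https://checkip.amazonaws.com)
#   eval "$(update_rule_cmd "$ip")"
```

Printing the command instead of running it directly makes it easy to check what will change before the rule is touched.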


(Brookie Guzder-Williams) #40

Hey everyone. I moved my gist over to a full repo and made a few changes:

In the setup script I added the pip installation, fixed a bug or two and included the conda py2 environment. I then added a create_instance script. So at this point the main setup comes down to this…

# local env
$ . create_instance.sh gpu-84 4
$ gcloud compute copy-files gpu-setup.sh gpu-84:~/

# remote instance
# Note: this sets up a Py3 environment with Keras 2. It also creates a Py2 environment with Keras 1 (`source activate py2`).
$ . gpu-setup.sh

There are a couple of things to cut and paste from the readme to get CUDNN installed, but it runs fast. The readme also has some checks to run, plus info on jupyter pwd and syncing sublime.


#41

Hi @sebastian,

Thank you for the awesome steps for working with GCP. I followed your steps and am stuck at the last step, connecting to the Jupyter notebook. I am getting the error “104.198.15.91 took too long to respond” in the browser. As you suggested in another post, I have checked the firewall rules, and they are as expected.
Any idea what might be going wrong here?
Below is a snapshot of the firewall rule: