Quick Google Colab setup for Part 2 week 1, along with the Pascal VOC dataset

(Sourabh Dattawad) #1

The following script handles the fast.ai setup along with the dataset required for Part 2 week 1. Open your notebook, turn on the GPU, and just run the script.

!pip install https://github.com/fastai/fastai/archive/master.zip
!pip install opencv-python
!apt update && apt install -y libsm6 libxext6
!pip3 install http://download.pytorch.org/whl/cu80/torch-0.3.0.post4-cp36-cp36m-linux_x86_64.whl 
!pip3 install torchvision
!mkdir data
!wget http://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar -P data/
!wget https://storage.googleapis.com/coco-dataset/external/PASCAL_VOC.zip -P data/
!tar -xf data/VOCtrainval_06-Nov-2007.tar -C data/
!unzip data/PASCAL_VOC.zip -d data/
!rm -rf data/PASCAL_VOC.zip data/VOCtrainval_06-Nov-2007.tar
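Once the script finishes, a quick sanity check can confirm everything landed under data/. A minimal sketch; the two directory names are taken from what the tar and zip above extract to:

```python
import os

def missing_files(data_dir="data"):
    """Return the expected extracted paths that are not present yet."""
    expected = [
        os.path.join(data_dir, "VOCdevkit"),   # from VOCtrainval_06-Nov-2007.tar
        os.path.join(data_dir, "PASCAL_VOC"),  # from PASCAL_VOC.zip
    ]
    return [p for p in expected if not os.path.exists(p)]
```

An empty list means both archives extracted where the notebook expects them.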

Thank you @binga for suggesting the edit.

(Vitaly Bushaev) #2

Thanks! It’s very useful!

(WG) #3

The Google Colab thing is kinda amazing!

Am I missing something, or is it free? What are the restrictions/limitations, and how does it compare with using AWS, Paperspace, Crestle, et al.?

(Vitaly Bushaev) #4

It is free. They only let you use the GPU for 12 hours straight. And if the connection is ever lost, you have to start all over again.
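One way to soften the lost-connection problem is to checkpoint long-running work to persistent storage (e.g. a mounted Google Drive) as you go. A minimal sketch, assuming your training state is pickleable; the path and the helper names are made up for illustration:

```python
import os
import pickle

def save_checkpoint(state, path="checkpoint.pkl"):
    """Dump training state atomically, so an interrupted write can't corrupt it."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, path)  # atomic rename on POSIX

def load_checkpoint(path="checkpoint.pkl"):
    """Return the last saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)
```

Call `save_checkpoint` every few epochs and `load_checkpoint` when the runtime reconnects; with fastai/PyTorch you would use `learn.save`/`torch.save` in place of pickle.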

(Vitaly Bushaev) #5

Also, I’ve heard reports that GPU memory is often shared, and it’s only luck that decides how much memory you’ll get :slight_smile:

(Nikhil B ) #6

Great resource, will give it a try. Hopefully the first lesson shouldn’t cause resource issues!

(WG) #7

Still … in terms of getting up and running quickly, this kinda blew my mind.

I was fully prepared to encounter and troubleshoot a bunch of issues, but instead I can just start coding. It may not be the ideal setup for ML solutions long term, but imho it should be the de facto environment for folks starting with part 1 of the course.

(Phani Srikanth) #8

On the contrary, my experience with Colab has been quite good. Most often I’ve had 11G of memory to myself.

Also, @wgpubs, Colab is backed by a K80 GPU, similar to a p2.xlarge instance. So the speed is similar. However, for faster training times you’d want a p3.2xlarge (whose V100 GPU has 16G of memory).

EDIT: Refer to this link for additional details: https://stackoverflow.com/questions/48750199/google-colaboratory-misleading-information-about-its-gpu-only-5-ram-available
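To see how much memory your particular session actually got, you can query nvidia-smi from the notebook. A small sketch; the parsing helper is mine, not part of any library:

```python
import subprocess

def parse_gpu_memory(csv_text):
    """Parse the CSV output of nvidia-smi (see gpu_memory below) into a
    list of (total_mib, free_mib) tuples, one per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        total, free = (int(v) for v in line.split(","))
        rows.append((total, free))
    return rows

def gpu_memory():
    """Query total/free GPU memory in MiB for every visible GPU."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total,memory.free",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_memory(out)
```

Running `gpu_memory()` right after connecting tells you whether you got the full ~11G or a shared slice.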

(Nikhil B ) #9

Yeah, a good place to get familiar with the notebook features before spinning up the hourly Paperspace/AWS.

(Jeremy Howard) #10

Note that the version of fastai in pip is pretty old at the moment.

(Phani Srikanth) #11

A nifty little improvement to the script, to address the older version of the library on pip, is to make this change:
!pip install https://github.com/fastai/fastai/archive/master.zip instead of !pip install fastai

(Sourabh Dattawad) #12

Thank you for your suggestion! I have edited the script.

(Mandar Deshpande) #13

Thank you @sourabhd for this resource!
Really useful to first work through the notebook once, before powering up Paperspace.

I think it would be a good first step to follow after each week’s lesson! :smile:

(Sourabh Dattawad) #14

I’m glad it helped. After using Google Colab you will never go back to Paperspace! :rofl:

(Jeremy Howard) #15

The Paperspace GPUs are much faster - although the price isn’t as good!

(Sourabh Dattawad) #16

Definitely @jeremy! Here in India, most students cannot afford to pay, and even personal GPUs are very costly. I did my entire Part 1 of fast.ai on Google Colab and it didn’t disappoint me.
I really enjoyed the first live session. Thank you so much for making a wonderful course!

(Arnav) #17

Hey @sourabhd. I was initially trying to implement whatever I could in Colab as well, but somewhere around lesson 4 or 5 it just started taking too long, and the memory issues were very annoying. What I wanted to know is: how long did training models (say, the language model in lesson 4) take for you? Did it keep running for that long without interruption?


Hello @sourabhd, it is my first time using Google Colab.
The data files are not found when I run the last three lines:

    tar: data/VOCtrainval_06-Nov-2007.tar: Cannot open: No such file or directory
    tar: Error is not recoverable: exiting now
    unzip:  cannot find or open data/PASCAL_VOC.zip, data/PASCAL_VOC.zip.zip or data/PASCAL_VOC.zip.ZIP.

Where have I gone wrong?


(Mandar Deshpande) #20

Hi Nandutu,

I just figured out that the wget command is downloading the tar files into the root directory instead of the data directory.

Simply change the last two lines to the following:

!tar -xf VOCtrainval_06-Nov-2007.tar -C data/
!unzip PASCAL_VOC.zip -d data/

Also, once you have extracted everything, you will need to move the PASCAL_VOC files into the data/ directory for the notebook to run as expected.

!mv data/PASCAL_VOC/* data/

No need to change anything else.
It will work as expected then :smile:

(Sourabh Dattawad) #21

I had used the -d flag instead of -P for the path. I have edited the code, and it should work now!
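If you’d rather sidestep wget path flags entirely, the downloads can also be done from Python, where the destination directory is explicit. A sketch using only the standard library; the helper name is made up:

```python
import os
import urllib.request

def download_to(url, dest_dir="data"):
    """Download url into dest_dir (created if needed); skip if already present."""
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest
```

For example, `download_to("http://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar")` always lands the file in data/, after which the tar/unzip lines from the script work unchanged.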