Colaboratory and Fastai


(Amrit ) #1

I really like Colaboratory as it's free, provides a GPU (from what I have read), and offers a Jupyter-notebook-like environment, and I wanted to see if anyone else was using it for fastai. I have managed to load the required dependencies and wanted to start a thread we can use for troubleshooting.

Has anyone here successfully used this to train?

Here is the link to Colaboratory: https://colab.research.google.com/

Here is a list of additional dependencies I had to install.

!pip install fastai
!pip install opencv-python
!apt update && apt install -y libsm6 libxext6
!pip3 install http://download.pytorch.org/whl/cu80/torch-0.3.0.post4-cp36-cp36m-linux_x86_64.whl 
!pip3 install torchvision

For uploading files and linking to your Google Drive (although I'm having issues using the files once uploaded - probably a simple directory issue):

from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))
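Since files.upload() returns a dict of {filename: bytes}, one way around the directory confusion is to write the uploaded contents to a known path yourself. Here is a stdlib-only sketch; the data/ destination directory is just an assumed layout, not something Colab requires:

```python
import os

def save_uploads(uploaded, dest_dir="data"):
    """Write each uploaded file's bytes under dest_dir and return the paths."""
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for name, contents in uploaded.items():
        path = os.path.join(dest_dir, name)
        with open(path, "wb") as f:
            f.write(contents)
        paths.append(path)
    return paths

# Simulated result of files.upload(), for illustration only:
uploaded = {"labels.csv": b"id,label\n1,cat\n2,dog\n"}
print(save_uploads(uploaded))  # -> ['data/labels.csv']
```

After this, the file can be referenced by the returned path from anywhere in the notebook.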

Linking to your Google cloud account

!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

(ecdrid) #2

Will try this today…


(Amrit ) #3

@ecdrid that would be great! I'm having issues understanding the directory process; for example, I uploaded labels.csv but don't know exactly how to access it in the notebook.


(Manikanta Yadunanda Sangu) #4

I have successfully trained lesson 1 on Google Colab with GPU. I will put up a post by evening.


(Amrit ) #5

@manikanta_s

that would be awesome!


(ecdrid) #6

I think we need to use the Python os module:

import os
entries = os.listdir("path")  # note: naming this 'dir' would shadow the built-in dir()
print(entries)

Have a look whether you can find your file in the listing.

Also, if we get a terminal, ls or pwd might help.
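Putting that idea together, a small helper that walks the working directory recursively to find where an upload such as labels.csv actually landed (the file name and starting directory are just examples):

```python
import os

def find_file(name, root="."):
    """Walk root recursively and return the first path whose basename matches."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            return os.path.join(dirpath, name)
    return None

print(find_file("labels.csv"))  # prints the full path if found, else None
```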

Will explore it further…


(ecdrid) #7

@amritv
https://colab.research.google.com/notebook#fileId=1srw_HFWQ2SMgmWIawucXfusGzrj1_U0q


(Amrit ) #8

@ecdrid
Thanks a lot for this; will work on it in the morning!


(john v) #9

How does sharing work with the free 12-hour GPU sessions? If I create a notebook and share it with someone, can they get a copy and run it independently of me? I want to teach some people, but they need to be able to use it independently.


(Manikanta Yadunanda Sangu) #10

Hi, please find my post on the same topic. I am facing issues while training the full network with SGDR and differential learning rates. I am exploring and will update the post accordingly.


(Manikanta Yadunanda Sangu) #11

Yes, they can copy it to their own cloud storage. It works similarly to Google Docs.


(Amrit ) #12

@manikanta_s thanks for sharing this!


(Amrit ) #13

@ecdrid thanks for this! I'm now able to connect to my Google Drive.


(ecdrid) #14

Yep…
Forgot to update here,
It’s good but uploading is costly…


(Moustapha Cheikh) #15

Hello everyone,
I uploaded the dataset to Google Drive, connected my account to Google Colab, and plotted some images.

But when I try to train the classifier I get this error.


(ecdrid) #16

Are you passing the suffix parameter?


(Cedric Chee) #17

Hey everyone! I hope you are enjoying Google Colab as much as I do. I would like to share some tips to make Google Colab easier to use day-to-day. Here are some convenient ways to upload a folder of files, such as large data sets, from your local computer.

  1. You can save and load files directly from Google Drive by mounting your Google Drive as a FUSE filesystem. A recipe for FUSE-mounting Drive: https://stackoverflow.com/questions/47374971/from-colab-directly-manipulate-sqlite3-format-data-in-google-drive

  2. File synchronization. Here's an example of downloading files from a Colab backend. That notebook also has other I/O examples that may be helpful, for example, copying files to Google Drive or Google Cloud Storage. https://colab.research.google.com/notebook#fileId=/v2/external/notebooks/io.ipynb

Source: https://www.kaggle.com/getting-started/47096#273889
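Once Drive is FUSE-mounted as in tip 1, it behaves like any other directory, so a dataset can be copied to the (faster) local VM disk with the standard library. A sketch; the drive/fastai/dogscats mount path below is a hypothetical example, not a fixed Colab location:

```python
import os
import shutil

def copy_dataset(src, dst):
    """Copy a dataset folder from the Drive mount to local disk, skipping the copy if it already exists."""
    if os.path.exists(dst):
        return dst  # already copied in a previous session
    shutil.copytree(src, dst)
    return dst

# Hypothetical paths for illustration; guarded so the snippet is safe to run anywhere.
if os.path.exists("drive/fastai/dogscats"):
    local_path = copy_dataset("drive/fastai/dogscats", "data/dogscats")
```

Training against the local copy avoids going through the FUSE layer on every batch.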


(Rahul Kumaresan) #18

Thank you @amritv, I was having a hard time setting up the environment on Colaboratory, this post is all I was looking for.


(Amrit ) #19

@cedric thanks for sharing this!


(Jacob Pettit) #20

Hi all,

I’m trying to get started doing the first lesson of the course, and I’m having issues with the Google Colaboratory environment. I’ve gotten the necessary packages installed, but when I try to run the pre-trained model, I’m given an error:

RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:58

I’ve tried reinstalling packages and updating/downgrading things, and have tried restarting the VM. I also tried ‘rm -rf {PATH}tmp’ which didn’t help. I’ll link my notebook below in case anyone wants to have a look, thanks for any help!

Best,

Jacob

UPDATE: 1/28/18 @ 8:47 PM

Running

! cat /proc/meminfo

returns

MemTotal: 13341960 kB
MemFree: 196212 kB
MemAvailable: 10361760 kB
Buffers: 663512 kB
Cached: 8932100 kB
SwapCached: 0 kB
Active: 6727808 kB
Inactive: 4949184 kB
Active(anon): 2240776 kB
Inactive(anon): 121940 kB
Active(file): 4487032 kB
Inactive(file): 4827244 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 532 kB
Writeback: 0 kB
AnonPages: 2081416 kB
Mapped: 536036 kB
Shmem: 281344 kB
Slab: 1144408 kB
SReclaimable: 1104688 kB
SUnreclaim: 39720 kB
KernelStack: 4160 kB
PageTables: 14820 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 6670980 kB
Committed_AS: 4439864 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 341964 kB
DirectMap2M: 9095168 kB
DirectMap1G: 6291456 kB

To test running on the GPU, I ran PyTorch's introductory tutorial code, which has you perform matrix addition on the GPU. I still got the out-of-memory error for those commands; the output above is from the notebook where I ran that tutorial, so memory should not have been an issue, but it was.
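As an aside, dumps like the one above are easy to inspect programmatically. A small stdlib-only sketch that turns the "Key: value kB" lines into integers (the sample string is just the first three lines of the dump):

```python
def parse_meminfo(text):
    """Parse '/proc/meminfo'-style 'Key:  value kB' lines into a dict of ints (values in kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if key and fields:
            info[key.strip()] = int(fields[0])
    return info

sample = """MemTotal:       13341960 kB
MemFree:          196212 kB
MemAvailable:   10361760 kB"""
mem = parse_meminfo(sample)
print(mem["MemAvailable"] // 1024, "MB available")  # 10118 MB available
```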

UPDATE 1/29/18 @ 5:25 PM

Solved the issue, on Google Colab when you accelerate your notebook using GPU, every bit of code you run in the cell is automatically compiled and sent to GPU. Calling functions in the code such as .cuda() in pytorch, which compile specific items for GPU, leads to an error because you are essentially trying to double compile something, which doesn’t work. I had to fork the fastai GitHub repository and edit the convnet file so that the function ‘to_gpu()’ is removed and that fixed the issue. I’m anticipating needing to make such modifications to each lesson’s code as I go through them, and I intend to share all of the code I change at the end.