Colaboratory and Fastai


(Amrit ) #1

I really like Colaboratory as it's free, provides a GPU (from what I have read), and offers a Jupyter-notebook-like environment, and I wanted to see if anyone else was using it for fastai. I have managed to load the required dependencies and wanted to start a thread we can use for troubleshooting.

Has anyone here successfully used this to train?

Here is the link to Colaboratory: https://colab.research.google.com/

Here is a list of additional dependencies I had to install.

!pip install fastai
!pip install opencv-python
!apt update && apt install -y libsm6 libxext6
!pip3 install http://download.pytorch.org/whl/cu80/torch-0.3.0.post4-cp36-cp36m-linux_x86_64.whl 
!pip3 install torchvision

For uploading files and linking to your Google Drive (although I'm having issues using the files once uploaded - probably a simple directory issue):

from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))
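Since files.upload() returns a dict of {filename: bytes}, one way around the directory confusion is to write the uploaded contents to a known path yourself. Here is a stdlib-only sketch; the data/ destination directory is just an assumed layout, not something Colab requires:

```python
import os

def save_uploads(uploaded, dest_dir="data"):
    """Write each uploaded file's bytes under dest_dir and return the paths."""
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for name, contents in uploaded.items():
        path = os.path.join(dest_dir, name)
        with open(path, "wb") as f:
            f.write(contents)
        paths.append(path)
    return paths

# Simulated result of files.upload(), for illustration only:
uploaded = {"labels.csv": b"id,label\n1,cat\n2,dog\n"}
print(save_uploads(uploaded))  # -> ['data/labels.csv']
```

After this, the file can be referenced by the returned path from anywhere in the notebook.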

Linking to your Google cloud account

!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

(ecdrid) #2

Will try this today…


(Amrit ) #3

@ecdrid that would be great! I'm having issues understanding the directory process; for example, I uploaded labels.csv but don't know exactly how to access it in the notebook.


(Manikanta Yadunanda Sangu) #4

I have successfully trained lesson 1 on Google Colab with GPU. I will put up a post by evening.


(Amrit ) #5

@manikanta_s

that would be awesome!


(ecdrid) #6

I think we need to use the Python os module:

import os
entries = os.listdir("path")  # note: naming this 'dir' would shadow the built-in dir()
print(entries)

Have a look whether you can find your file in the listing.

Also, if we get a terminal, ls or pwd might help.
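Putting that idea together, a small helper that walks the working directory recursively to find where an upload such as labels.csv actually landed (the file name and starting directory are just examples):

```python
import os

def find_file(name, root="."):
    """Walk root recursively and return the first path whose basename matches."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            return os.path.join(dirpath, name)
    return None

print(find_file("labels.csv"))  # prints the full path if found, else None
```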

Will explore it further…


(ecdrid) #7

@amritv
https://colab.research.google.com/notebook#fileId=1srw_HFWQ2SMgmWIawucXfusGzrj1_U0q


(Amrit ) #8

@ecdrid
Thanks a lot for this; will work on it in the morning!


(john v) #9

How does sharing work with the free 12-hour GPU sessions? If I create a notebook and share it with someone, can they get a copy and run it independently of me? I want to teach some people, but they need to be able to use it independently.


(Manikanta Yadunanda Sangu) #10

Hi, please find my post on the same topic. I am facing issues while training the full network with SGDR and differential learning rates. I am exploring and will update the post accordingly.


(Manikanta Yadunanda Sangu) #11

Yes, they can copy it to their own cloud storage. It works similarly to Google Docs.


(Amrit ) #12

@manikanta_s thanks for sharing this!


(Amrit ) #13

@ecdrid thanks for this! I'm now able to connect to my Google Drive.


(ecdrid) #14

Yep…
Forgot to update here,
It’s good but uploading is costly…


(Moustapha Cheikh) #15

Hello everyone,
I uploaded the dataset to Google Drive, connected my account to Google Colab, and plotted some images.

But when I try to train the classifier I get this error.


(ecdrid) #16

Are you passing the suffix parameter?


(Cedric Chee) #17

Hey everyone! I hope you are enjoying Google Colab as much as I do. I would like to share some tips to make Google Colab easier to use day-to-day. Here are some convenient ways to upload a folder of files, such as large data sets, from your local computer.

  1. You can save and load files directly from Google Drive by mounting your Google Drive as a FUSE filesystem. A recipe for FUSE-mounting Drive: https://stackoverflow.com/questions/47374971/from-colab-directly-manipulate-sqlite3-format-data-in-google-drive

  2. File synchronization. Here's an example of downloading files from a Colab backend. That notebook also has other I/O examples that may be helpful, for example, copying files to Google Drive or Google Cloud Storage. https://colab.research.google.com/notebook#fileId=/v2/external/notebooks/io.ipynb

Source: https://www.kaggle.com/getting-started/47096#273889
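Once Drive is FUSE-mounted as in tip 1, it behaves like any other directory, so a dataset can be copied to the (faster) local VM disk with the standard library. A sketch; the drive/fastai/dogscats mount path below is a hypothetical example, not a fixed Colab location:

```python
import os
import shutil

def copy_dataset(src, dst):
    """Copy a dataset folder from the Drive mount to local disk, skipping the copy if it already exists."""
    if os.path.exists(dst):
        return dst  # already copied in a previous session
    shutil.copytree(src, dst)
    return dst

# Hypothetical paths for illustration; guarded so the snippet is safe to run anywhere.
if os.path.exists("drive/fastai/dogscats"):
    local_path = copy_dataset("drive/fastai/dogscats", "data/dogscats")
```

Training against the local copy avoids going through the FUSE layer on every batch.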


(Rahul Kumaresan) #18

Thank you @amritv, I was having a hard time setting up the environment on Colaboratory, this post is all I was looking for.


(Amrit ) #19

@cedric thanks for sharing this!


(Jacob Pettit) #20

Hi all,

I’m trying to get started doing the first lesson of the course, and I’m having issues with the Google Colaboratory environment. I’ve gotten the necessary packages installed, but when I try to run the pre-trained model, I’m given an error:

RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:58

I’ve tried reinstalling packages and updating/downgrading things, and have tried restarting the VM. I also tried ‘rm -rf {PATH}tmp’ which didn’t help. I’ll link my notebook below in case anyone wants to have a look, thanks for any help!

Best,

Jacob

UPDATE: 1/28/18 @ 8:47 PM

Running

! cat /proc/meminfo

returns

MemTotal: 13341960 kB
MemFree: 196212 kB
MemAvailable: 10361760 kB
Buffers: 663512 kB
Cached: 8932100 kB
SwapCached: 0 kB
Active: 6727808 kB
Inactive: 4949184 kB
Active(anon): 2240776 kB
Inactive(anon): 121940 kB
Active(file): 4487032 kB
Inactive(file): 4827244 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 532 kB
Writeback: 0 kB
AnonPages: 2081416 kB
Mapped: 536036 kB
Shmem: 281344 kB
Slab: 1144408 kB
SReclaimable: 1104688 kB
SUnreclaim: 39720 kB
KernelStack: 4160 kB
PageTables: 14820 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 6670980 kB
Committed_AS: 4439864 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 341964 kB
DirectMap2M: 9095168 kB
DirectMap1G: 6291456 kB

To test running on the GPU, I ran PyTorch's introductory tutorial code, which has you perform matrix addition on the GPU. I still got the out-of-memory error for those commands; the output above is from the notebook where I ran that tutorial, so memory should not have been an issue, but it was.
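As an aside, dumps like the one above are easy to inspect programmatically. A small stdlib-only sketch that turns the "Key: value kB" lines into integers (the sample string is just the first three lines of the dump):

```python
def parse_meminfo(text):
    """Parse '/proc/meminfo'-style 'Key:  value kB' lines into a dict of ints (values in kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if key and fields:
            info[key.strip()] = int(fields[0])
    return info

sample = """MemTotal:       13341960 kB
MemFree:          196212 kB
MemAvailable:   10361760 kB"""
mem = parse_meminfo(sample)
print(mem["MemAvailable"] // 1024, "MB available")  # 10118 MB available
```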

UPDATE 1/29/18 @ 5:25 PM

Solved the issue, on Google Colab when you accelerate your notebook using GPU, every bit of code you run in the cell is automatically compiled and sent to GPU. Calling functions in the code such as .cuda() in pytorch, which compile specific items for GPU, leads to an error because you are essentially trying to double compile something, which doesn’t work. I had to fork the fastai GitHub repository and edit the convnet file so that the function ‘to_gpu()’ is removed and that fixed the issue. I’m anticipating needing to make such modifications to each lesson’s code as I go through them, and I intend to share all of the code I change at the end.