Google Colab FastAI Setup

To use Google Colab with FastAI v1, set the hardware accelerator to GPU under Runtime -> Change runtime type.

Then run the following code in Colab to install up-to-date versions of PyTorch and fastai.

# http://pytorch.org/
from os.path import exists

# Derive the CUDA tag (e.g. cu92) from the installed libcudart version.
cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\.\([0-9]*\)\.\([0-9]*\)$/cu\1\2/'
# Fall back to the CPU build when no NVIDIA device is present.
accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'
!pip install torch_nightly -f https://download.pytorch.org/whl/nightly/{accelerator}/torch_nightly.html
  
import torch
print(torch.__version__)
print(torch.cuda.is_available())
print(torch.backends.cudnn.enabled)

!pip install fastai

import fastai
from fastai import *
from fastai.vision import *
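
For anyone puzzling over the sed expression in the install cell: as far as I can tell, it just turns the version suffix of libcudart into the tag used in the wheel URL. Here is a plain-Python sketch of the same idea. The helper name cuda_tag and the sample ldconfig line are made up for illustration and are not taken from a real Colab instance.

```python
import re

def cuda_tag(ldconfig_line):
    """Return a tag such as 'cu92' for a line ending in '.9.2', or None if nothing matches."""
    match = re.search(r"\.(\d+)\.(\d+)$", ldconfig_line.strip())
    return "cu{}{}".format(*match.groups()) if match else None

# Hypothetical line in the style of `ldconfig -p | grep cudart.so` output.
sample = "libcudart.so.9.2 (libc6,x86-64) => /usr/local/cuda/lib64/libcudart.so.9.2"
print(cuda_tag(sample))  # cu92
```

If the function returns None, the nightly URL would have no valid accelerator tag, so it is worth printing the value before running pip.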

This should work fine with MNIST, assuming you got the full GPU RAM instance. It works fine for me. But it doesn’t work with Dogs and Cats. Anyone know of a way to get it to work?

# dogs and cats example code 
path = untar_data(URLs.DOGS)
print(path)

data = ImageDataBunch.from_folder(
    path, 
    ds_tfms=get_transforms(), 
    tfms=imagenet_norm, 
    size=224
 )
img,label = data.valid_ds[-1]
img.show(title=data.classes[label])

learn = ConvLearner(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(1)

Error:
RuntimeError: DataLoader worker (pid 216) is killed by signal: Bus error.

Checking RAM with
import psutil
import humanize
import os
import GPUtil as GPU

GPUs = GPU.getGPUs()

# XXX: only one GPU on Colab and isn’t guaranteed
gpu = GPUs[0]

def printm():
    process = psutil.Process(os.getpid())
    print("Gen RAM Free: " + humanize.naturalsize(psutil.virtual_memory().available),
          " | Proc size: " + humanize.naturalsize(process.memory_info().rss))
    print("GPU RAM Free: {0:.0f}MB | Used: {1:.0f}MB | Util {2:3.0f}% | Total {3:.0f}MB".format(
        gpu.memoryFree, gpu.memoryUsed, gpu.memoryUtil*100, gpu.memoryTotal))

printm()

yields:

Gen RAM Free: 11.8 GB | Proc size: 1.9 GB
GPU RAM Free: 10339MB | Used: 1102MB | Util 10% | Total 11441MB

So I’m getting the full Colab GPU RAM allotment.

SO and PyTorch forums don’t directly address the issue, but it seems like it’s a memory allocation issue. Setting num_workers = 0 might help, but I don’t see where to do that. Anyone have any ideas?
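
One more thing that may be worth checking, on the assumption that the bus error comes from DataLoader worker processes running out of shared memory (/dev/shm) rather than GPU RAM: PyTorch workers pass batches back to the main process through shared memory, so a small /dev/shm can kill them even when the GPU has plenty of room. This is a rough sketch using only the standard library; shm_report is a name I made up.

```python
import os
import shutil

def shm_report(path="/dev/shm"):
    """Return (total, used, free) in bytes for the shared-memory mount, or None if it is absent."""
    if not os.path.isdir(path):
        return None
    usage = shutil.disk_usage(path)
    return usage.total, usage.used, usage.free

report = shm_report()
if report is None:
    print("no /dev/shm on this machine")
else:
    total, used, free = report
    print("shm total: {:.0f} MB | free: {:.0f} MB".format(total / 2**20, free / 2**20))
```

If the total is tiny compared with the general RAM figure above, that would be consistent with the workers being the problem, though it does not prove it.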

Dogs and Cats runs on Colab if I drop size to 20, but it doesn’t perform very well then.

size=20

Resolved by setting num_workers=0 in ImageDataBunch.from_folder().

It now runs with size=256 on resnet50, and accuracy is above 99% after fit_one_cycle(1).

# dogs and cats example code 
path = untar_data(URLs.DOGS)
print(path)

data = ImageDataBunch.from_folder(
    path, 
    ds_tfms=get_transforms(), 
    tfms=imagenet_norm, 
    num_workers=0,
    size=224
 )
img,label = data.valid_ds[-1]
img.show(title=data.classes[label])

learn = ConvLearner(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(1)

Here is a link to the notebook that I could successfully run on Google Colab.
fast.ai v1 (Course v3) Lesson 1:

The problems you might encounter are:

  1. The PyTorch version.
  2. Low allocated GPU memory. To work around this, experiment with the batch size (bs). With 64M of allocated memory, bs=10 worked for me.

If you run out of memory while experimenting with certain values, you have to restart the kernel before trying new ones.
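
To get a feel for how bs and size interact before burning a kernel restart, here is a back-of-the-envelope sketch. It only counts the raw float32 input batch (bs x channels x size x size x 4 bytes) and ignores model weights, activations and gradients, which usually dominate, so treat it as a lower bound. The function name input_batch_bytes is hypothetical.

```python
def input_batch_bytes(bs, size, channels=3, bytes_per_value=4):
    """Bytes taken by one batch of square float32 images (input tensor only)."""
    return bs * channels * size * size * bytes_per_value

for bs, size in [(64, 224), (10, 224), (64, 20)]:
    mib = input_batch_bytes(bs, size) / 2**20
    print("bs={:>3} size={:>3} -> {:.2f} MiB".format(bs, size, mib))
```

The point is only the scaling: memory grows linearly with bs and quadratically with size, so halving size frees roughly four times as much as halving bs.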

The server setup for Colab (and many other platforms) has been simplified; the instructions are here:
https://course-v3.fast.ai/start_colab.html

The Colab environment can now be set up with a single line of code.

The Colab team also says that a fix is coming for the memory issue: https://github.com/googlecolab/colabtools/issues/329

This is great!

It’s super confusing with multiple versions of the courses referring to multiple versions of the libraries. I am running in circles: watching a lecture, trying to run the code, which fails, and finding that all the documentation points to a version of the library that the lectures and notebooks don’t use.

Anyway, is there a similar script to get Colab set up that works with the lectures that are available?

For example, after running the above script and trying to run the lecture4-imdb notebook, the import fails with:

ModuleNotFoundError: No module named 'fastai.learner'

This is (I believe) because the lectures and corresponding notebooks use fastai 0.7.0.

However, if I just force-install fastai 0.7.0, the installed version of torch is incompatible with it, and on and on in a Möbius strip of dependency hell.
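
A quick way to see which combination is actually installed before running a notebook is sketched below. It assumes Python 3.8 or newer for importlib.metadata; installed_version is a helper name I made up, and it only reports versions, it does not fix the mismatch.

```python
from importlib import metadata  # standard library, Python 3.8+

def installed_version(package):
    """Return the installed version string for a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

for name in ("fastai", "torch"):
    print(name, installed_version(name) or "not installed")
```

If fastai reports 1.x, the course-v3 notebooks are the ones to use; a 0.7.x version goes with the older lecture notebooks.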

Hi @eof,
Any luck getting the older Part 1 notebook files to work on Colab?

Currently I’ve got my Colab notebook using fastai 0.7.0.

But I think I’ve got the wrong version of pytorch installed. Any idea which version we should install to get it running?

The setup files above look like they’re good to go with FastAI v1, but I’d like to use the older fastai API.

I did have it working for the first few notebooks but wasn’t able to distill it to a single cell for bootstrapping. I eventually ran into probably the same problem as you: mismatched pytorch and fastai versions.

Since then, I have spent time reading docs.fast.ai, and lecture 5 especially helped me understand the underlying pieces well enough to slowly work through things with fastai v1.

Essentially I have abandoned the older notebook versions, and I recommend you do too :slight_smile:

Thanks @eof
I think I shall do the same :slight_smile:

Sample notebook that installs the fastai library and downloads data from Kaggle using the Kaggle API.