Google Colab quietly turns on TPU v2 for FREE for everyone


(Cedric Chee) #1

It looks like Google quietly turned on free TPU v2 for Google Colab 2 days ago. I first heard about this at yesterday’s TensorFlow and Deep Learning group meetup in Singapore, where Google’s engineers talked about TensorFlow 2.0. The GPU hardware accelerator is still available. If I heard correctly, the free TPU will be available for a limited time only. I have tested it by training a simple MNIST model and it works for me in this part of the world. I am planning to do a quick benchmark and compare the performance with GPU later.

Note that TPU v3 is in alpha testing and not available to everyone yet.

See my Twitter thread for more details.


1. Updated on 2018-09-28 12:36 PM GMT+8:

Slides and the notebook for samwit’s talk, “Get training in TensorFlow Keras on TPUs for free!!”:

https://colab.research.google.com/drive/1F8txK1JLXKtAkcvSRQz2o7NSTNoksuU2

2. Updated on 2018-09-28 5:30 PM GMT+8:

MNIST benchmark done. Obviously, training MNIST on a TPU is overkill and the TPU barely gets a chance to warm up. :) One epoch took around 3s!!!

My notebook: https://nbviewer.jupyter.org/github/cedrickchee/data-science-notebooks/blob/master/notebooks/tensorflow/google_cloud_tpu/guide_to_tensorflow_keras_on_tpu_mnist.ipynb


#2

I appreciate the heads up. Thank you.


(pascal louis-marie) #3

Hi,

That’s amazing news, and I have only just discovered it: a free TPU is like a dream come true.
So I tried to test it out: Colab CPU vs GPU vs TPU.
The training time for 12 epochs on the MNIST dataset is almost the same between GPU and TPU.

CPU, 12 epochs of MNIST training: 118s
https://colab.research.google.com/drive/1btld1Qk3V57FdpyUKnzLe-oDKNh16kAk

GPU, 12 epochs of MNIST training: 45.5s
https://colab.research.google.com/drive/1rawejJ21j-rN8HVG584hFFHkOrSzLCCy

TPU, 12 epochs of MNIST training: 43.2s
https://colab.research.google.com/drive/1YJJHo3pvT8d2MzYr1ptypWBb5mv1zvG8

Unless I am missing something here, that’s really unexpected.
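To put the three totals above side by side, here is a small back-of-the-envelope sketch. It uses only the timings quoted in this post (12 epochs each); the variable names are made up for illustration.

```python
# Total training time in seconds for 12 epochs, as reported in the post above.
EPOCHS = 12
total_seconds = {"CPU": 118.0, "GPU": 45.5, "TPU": 43.2}

# Average time per epoch for each accelerator.
per_epoch = {dev: t / EPOCHS for dev, t in total_seconds.items()}

# Speedup of each accelerator relative to the CPU run.
speedup_vs_cpu = {dev: total_seconds["CPU"] / t for dev, t in total_seconds.items()}

# How much faster the TPU run was than the GPU run.
tpu_vs_gpu = total_seconds["GPU"] / total_seconds["TPU"]

for dev in total_seconds:
    print(f"{dev}: {per_epoch[dev]:.2f} s/epoch, {speedup_vs_cpu[dev]:.2f}x vs CPU")
print(f"TPU vs GPU: {tpu_vs_gpu:.2f}x")
```

On these numbers the TPU run is only about 5% faster than the GPU run, which is the small gap described above.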


(Cedric Chee) #4

Hi,

I can’t view the Colab notebooks. This is what I see:

[Screenshot from 2018-10-08 23-45-10]

I suggest you either make the notebook available to everyone or grant access to my Google Account.

But without looking at your code, I assume you are using Keras + TensorFlow. If that’s the case, I suspect the TPU run is slow because you are running the exact same training code (train/fit the model) as on the GPU, without using tf.data for the input pipeline.
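For reference, a minimal sketch of what a tf.data input pipeline can look like. It assumes TensorFlow 2.4 or newer (for `tf.data.AUTOTUNE`) and uses random arrays as a stand-in for MNIST so that it runs without a download; `make_dataset` is a hypothetical helper, not code from any of the notebooks in this thread.

```python
import numpy as np
import tensorflow as tf


def make_dataset(images, labels, batch_size, training=True):
    """Hypothetical helper: wrap in-memory arrays in a tf.data pipeline."""
    ds = tf.data.Dataset.from_tensor_slices((images, labels))
    if training:
        # Shuffle across the whole (small) dataset each epoch.
        ds = ds.shuffle(buffer_size=len(images))
    # drop_remainder=True gives every batch the same static shape.
    ds = ds.batch(batch_size, drop_remainder=True)
    # Prepare the next batches while the accelerator is busy.
    return ds.prefetch(tf.data.AUTOTUNE)


# Synthetic stand-in for MNIST (assumption: 28x28 grayscale, 10 classes).
rng = np.random.default_rng(0)
images = rng.random((1024, 28, 28, 1), dtype=np.float32)
labels = rng.integers(0, 10, size=1024).astype(np.int32)

train_ds = make_dataset(images, labels, batch_size=128)
```

A dataset built this way can be passed directly to Keras, for example `model.fit(train_ds, epochs=12)`. Creating the model under a TPU distribution strategy is a separate step that is not shown here.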


(pascal louis-marie) #5

Cedric,
Thanks for your quick feedback. Each Colab page should be available now.
Let me know if that is not the case.

Actually, in my benchmark I also get 3s per epoch with the TPU, so that matches your own result.
But the GPU, by comparison, takes 4s per epoch.

I was expecting a bigger gap, as demonstrated here: https://blog.riseml.com/benchmarking-googles-new-tpuv2-121c03b71384

On a side note, I found that using a smaller batch size improves performance on the TPU
[training over 12 epochs, MNIST dataset]:
batch size 1024 -> 53s training time; batch size 128 -> 43s training time
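The two batch-size runs above can be converted into steps per epoch and overall throughput with a quick calculation. This is only a sketch: it assumes both runs used the full 60,000-image MNIST training split, which the post does not state, and `summarize` is a hypothetical helper.

```python
import math

TRAIN_EXAMPLES = 60_000  # assumption: full MNIST training split
EPOCHS = 12


def summarize(batch_size, total_seconds):
    """Hypothetical helper: derive step count and throughput from a total time."""
    steps_per_epoch = math.ceil(TRAIN_EXAMPLES / batch_size)
    examples_per_second = TRAIN_EXAMPLES * EPOCHS / total_seconds
    ms_per_step = 1000 * total_seconds / (steps_per_epoch * EPOCHS)
    return steps_per_epoch, examples_per_second, ms_per_step


# Batch size -> total training time in seconds, as reported above.
runs = {1024: 53.0, 128: 43.0}

for batch_size, total in runs.items():
    steps, throughput, step_ms = summarize(batch_size, total)
    print(f"batch {batch_size}: {steps} steps/epoch, "
          f"{throughput:.0f} examples/s, {step_ms:.1f} ms/step")
```

These totals include any one-off startup cost, so the per-step figures are averages rather than steady-state step times.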