It looks like Google quietly turned on free TPU v2 access for Google Colab two days ago. I first heard about this at yesterday’s TensorFlow and Deep Learning group meetup in Singapore, where Google’s engineers talked about TensorFlow 2.0. The GPU hardware accelerator is still available. If I heard correctly, the TPU will be available for a limited time only. I tested it by training a simple MNIST model, and it works for me in this part of the world. I am planning to do a quick benchmark and compare the performance with the GPU later.
Note that TPU v3 is in alpha testing and not available to everyone yet.
See my Twitter thread for more details.
1. Updated on 2018-09-28 12:36 PM GMT+8:
Slides and the notebook for samwit’s talk, “Get training in TensorFlow Keras on TPUs for free!!”:
2. Updated on 2018-09-28 17:30 GMT+8:
MNIST benchmark done. Obviously training MNIST on a TPU is a bit overkill and the TPU barely gets a chance to warm up. One epoch took around 3s!!!
My notebook: https://nbviewer.jupyter.org/github/cedrickchee/data-science-notebooks/blob/master/notebooks/tensorflow/google_cloud_tpu/guide_to_tensorflow_keras_on_tpu_mnist.ipynb
I appreciate the heads up. Thank you.
That’s amazing news that I have also just discovered: a free TPU is like a dream come true.
So I tried testing it out: Colab CPU vs GPU vs TPU.
And the training time for 12 epochs on the MNIST dataset is almost the same between GPU and TPU.
CPU, 12 epochs of MNIST training: 118s
GPU, 12 epochs of MNIST training: 45.5s
TPU, 12 epochs of MNIST training: 43.2s
Unless I’m missing something here, that’s really unexpected.
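To put the three totals on a common footing, here is a small arithmetic sketch that converts them into seconds per epoch and relative speedups. The only inputs are the timings quoted above; the helper names `per_epoch` and `speedup` are made up for illustration.

```python
# Total training times reported above for 12 epochs of MNIST, in seconds.
timings = {"CPU": 118.0, "GPU": 45.5, "TPU": 43.2}
EPOCHS = 12


def per_epoch(total_seconds, epochs=EPOCHS):
    """Average wall-clock seconds per epoch."""
    return total_seconds / epochs


def speedup(baseline_seconds, seconds):
    """How many times faster `seconds` is than `baseline_seconds`."""
    return baseline_seconds / seconds


for name, total in timings.items():
    print(
        f"{name}: {per_epoch(total):.2f}s/epoch, "
        f"{speedup(timings['CPU'], total):.2f}x vs CPU"
    )

# The comparison that matters here: TPU relative to GPU.
print(f"TPU vs GPU: {speedup(timings['GPU'], timings['TPU']):.2f}x")
```

On these numbers the TPU comes out only about 1.05x faster than the GPU, which is the gap being questioned.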
I can’t view the Colab notebooks. This is what I see:
I suggest you either make the notebook available to everyone or grant access to my Google Account.
But without looking at your code, I assume you are using Keras + TensorFlow. If that’s the case, I suspect you are seeing the slowness because you are doing the exact same thing (train/fit model) without using tf.data for the data pipeline.
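For anyone unfamiliar with it, a tf.data input pipeline might look roughly like the sketch below. This is only an illustration under assumptions, not the code from either notebook: `make_dataset` is a hypothetical helper, the random arrays stand in for MNIST, and it is written against the current TensorFlow 2.x API (`tf.data.AUTOTUNE`), which is newer than the version Colab shipped at the time.

```python
import numpy as np
import tensorflow as tf


def make_dataset(images, labels, batch_size, training=True):
    """Build a tf.data pipeline from in-memory NumPy arrays (hypothetical helper)."""
    ds = tf.data.Dataset.from_tensor_slices((images, labels))
    if training:
        # Shuffle across the whole (small) dataset each epoch.
        ds = ds.shuffle(buffer_size=len(images))
    # drop_remainder=True keeps every batch the same shape; fixed shapes
    # are generally what accelerator compilation expects.
    ds = ds.batch(batch_size, drop_remainder=True)
    # Prepare the next batch while the current one is being consumed.
    ds = ds.prefetch(tf.data.AUTOTUNE)
    return ds


# Synthetic stand-in for MNIST: 1024 grayscale 28x28 images, 10 classes.
rng = np.random.default_rng(0)
images = rng.random((1024, 28, 28, 1), dtype=np.float32)
labels = rng.integers(0, 10, size=1024).astype(np.int32)

train_ds = make_dataset(images, labels, batch_size=128)
batch_images, batch_labels = next(iter(train_ds))
print(batch_images.shape, batch_labels.shape)
```

A dataset built this way can be passed directly to a Keras model’s `fit` in place of NumPy arrays. Whether it closes the GPU/TPU gap on a model as small as MNIST would still need to be measured.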
Thanks for your quick feedback. Each Colab page should be available now.
Let me know if not.
Actually, in my benchmark I also get 3s per epoch with the TPU, so that matches your own result.
But on the GPU it is 4s per epoch.
I was expecting a bigger gap, as demonstrated here: https://blog.riseml.com/benchmarking-googles-new-tpuv2-121c03b71384
On a side note, I found that using a smaller batch size increases performance on the TPU
[training over 12 epochs, MNIST dataset]:
bc=1024 -> 53s training time; bc=128 -> 43s training time
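As a rough way to read those two numbers, the sketch below turns them into steps per epoch, time per step, and images per second. It assumes the standard 60,000-image MNIST training split and that the last partial batch is kept (neither is confirmed in the post); the function names are invented for the example.

```python
import math

TRAIN_IMAGES = 60_000  # assumption: standard MNIST training split
EPOCHS = 12

# Total training time in seconds for each batch size, as reported above.
results = {1024: 53.0, 128: 43.0}


def steps_per_epoch(num_images, batch_size):
    """Number of batches per epoch, counting a final partial batch."""
    return math.ceil(num_images / batch_size)


def seconds_per_step(total_seconds, num_images, batch_size, epochs=EPOCHS):
    """Average wall-clock time spent on one batch."""
    return total_seconds / (epochs * steps_per_epoch(num_images, batch_size))


def images_per_second(total_seconds, num_images, epochs=EPOCHS):
    """Overall training throughput."""
    return num_images * epochs / total_seconds


for batch_size, total in results.items():
    steps = steps_per_epoch(TRAIN_IMAGES, batch_size)
    ms = 1000 * seconds_per_step(total, TRAIN_IMAGES, batch_size)
    ips = images_per_second(total, TRAIN_IMAGES)
    print(f"bc={batch_size}: {steps} steps/epoch, {ms:.1f} ms/step, {ips:.0f} images/s")
```

Under these assumptions the smaller batch runs about eight times as many steps per epoch yet still finishes sooner overall. The arithmetic only restates the measurements; it does not explain why.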
So does anyone know what the deal is yet? Are we getting a Poor Man’s version of TPUs or what?