It did not take long to realize that the first time I read the paper I barely looked at the results. It is not the easiest paper to understand, as the authors do not give many details about the underlying math, nor do they clearly explain their implementation.
I ran into a problem when trying to implement this: the authors seem to assume a multi-GPU setup for computing the statistics we are looking for.
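For context, those statistics boil down to estimates of the true gradient norm and the gradient noise, which the paper obtains by comparing gradient norms measured at two different batch sizes (in their case, per-GPU vs. aggregated gradients). A minimal sketch of that estimator, with the function name and wiring being my own illustration rather than anything from the thread:

```python
import numpy as np

def noise_scale_estimates(g_small, g_big, b_small, b_big):
    """Estimate |G|^2 and tr(Sigma) from two gradient estimates computed
    at batch sizes b_small and b_big (e.g. one micro-batch gradient vs.
    the gradient accumulated over several micro-batches on a single GPU).

    The simple noise scale is then roughly s / g2. Note that these are
    noisy, unbiased estimates: g2 can come out negative on a single
    measurement, which is why smoothing over many steps is needed.
    """
    sq_small = np.dot(g_small, g_small)  # |G_small|^2
    sq_big = np.dot(g_big, g_big)        # |G_big|^2
    g2 = (b_big * sq_big - b_small * sq_small) / (b_big - b_small)
    s = (sq_small - sq_big) / (1.0 / b_small - 1.0 / b_big)
    return g2, s
```

On a single GPU, one plausible route is gradient accumulation: treat each micro-batch gradient as the small-batch measurement and the accumulated average as the large-batch one.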
I have tried three different ways to work around this, but none of them really works.
Update: I found the solution on my own, managed to implement the paper, and did a little testing on the Rossmann stores dataset. I found a 4x speedup using a batch size of 512 instead of 64!
I wrote a Medium article to share the results; I would be glad to hear your feedback.
Thank you! It's a bit tricky to use, as the curve is quite unstable due to all the approximation involved. My advice is to increase beta, e.g. from 0.99 to 0.999, if the curve is too bumpy.
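The beta mentioned here is the coefficient of an exponential moving average over the noisy per-step estimates; a higher beta averages over more steps and so gives a smoother (but laggier) curve. A small self-contained sketch of that kind of debiased EMA, just to illustrate the knob:

```python
def ema_smooth(values, beta=0.999):
    """Debiased exponential moving average of a sequence.

    Raising beta (e.g. 0.99 -> 0.999) smooths a bumpy curve at the cost
    of reacting more slowly to real changes. The division by
    (1 - beta**i) corrects the bias toward zero in the early steps.
    """
    avg, out = 0.0, []
    for i, v in enumerate(values, start=1):
        avg = beta * avg + (1 - beta) * v
        out.append(avg / (1 - beta ** i))  # bias-corrected average
    return out
```

With a constant input the corrected average reproduces the constant from the first step, which is exactly what the bias correction buys you.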
But please keep me updated
I think it works rather well on tabular data at least, but I will have to try it with images and NLP. In any case, you might run into CUDA out-of-memory errors if you use too big a batch size ^^
@DanyWin, could you encapsulate your work in a bs_finder-style callback and put it in a GitHub repository? That would make it much easier to test and reuse.