Meet RAdam - imo the new state-of-the-art AI optimizer

I wrote a post explaining why the current baseline on the leaderboard looks lower than it should be. It wasn’t quite about the number of GPUs after all.
