Lesson 1 - Non-beginner discussion

If I have control of the room, I’ll try to. (Anyone who does can push record :wink: )

3 Likes

Ranger already does a gradual LR warm-up. So you should generally use Ranger+flat_cos or Adam+one_cycle.
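
For anyone newer to the fastai2 API, that pairing looks roughly like this (a sketch: `dls`, the architecture, and the hyperparameters are placeholders, not a recommendation):

from fastai.vision.all import *

# assuming `dls` is a DataLoaders you've already built
learn = cnn_learner(dls, resnet34, opt_func=ranger, metrics=accuracy)
learn.fit_flat_cos(5, 1e-3)    # Ranger pairs with the flat + cosine-anneal schedule

learn = cnn_learner(dls, resnet34, metrics=accuracy)   # opt_func defaults to Adam
learn.fit_one_cycle(5, 1e-3)   # Adam pairs with the one-cycle policy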

5 Likes

@muellerzr Is it possible to record the questions asked as well? I’m watching through the video, which is very helpful as is, but I’m having to infer each question from the answer you gave in the walkthrough! :sweat_smile:

1 Like

I wasn’t able to look it over, but it sounds like it didn’t pick up everyone else. In the future I’ll use the built-in Zoom recording, so that should be better. Apologies!

I’ve been working on reconstructing Hyperspectral Imagery from an RGB input using the NoGAN approach, and for me switching to Ranger + OneCycle cut the MRAE (Mean Relative Absolute Error) metric roughly in half (~0.11 to ~0.0575). But this is just one instance; I haven’t used it anywhere else at this point.
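
For anyone unfamiliar with the metric, MRAE is just the absolute error normalised by the ground-truth value, averaged over every pixel and band. A rough PyTorch sketch (the eps guard is my own addition to avoid division by zero):

import torch

def mrae(pred, targ, eps=1e-8):
    # Mean Relative Absolute Error: mean of |pred - targ| / targ over all elements
    return (torch.abs(pred - targ) / (targ + eps)).mean()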

3 Likes

That’s really cool. And now that I think about it, most of the benefit I got using Ranger + fit_one_cycle comes from the RAdam part, and less so the LookAhead optimizer. So I might try running only RAdam + fit_one_cycle and see if I can get a speedup!

Currently I’m running some FastGarden tests and it looks like Ranger + fit_flat_cos blows the one-cycle learning policy out of the water (by ~5-8%).
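
If anyone wants to try the same ablation, something like this should isolate the RAdam part (a sketch; `dls`, the architecture, and the hyperparameters are placeholders):

from fastai.vision.all import *

# Ranger = RAdam wrapped in Lookahead, so passing RAdam directly as the
# opt_func drops the Lookahead part
learn = cnn_learner(dls, resnet34, opt_func=RAdam, metrics=accuracy)
learn.fit_one_cycle(5, 1e-3)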

1 Like

Maybe a dumb question, but currently Keras, PyTorch, and fastai all reduce the learning rate after hitting the patience limit on a plateau, right? Why not roll back to the last best state, reduce the LR, and continue, instead of just reducing the LR and continuing further?
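
For reference, this is the pattern I mean, using fastai’s version of the callback (a sketch; `learn` is assumed to exist, and the patience/factor values are just illustrative):

from fastai.callback.tracker import ReduceLROnPlateau

# when valid_loss hasn't improved for `patience` epochs, divide the LR by
# `factor` and simply keep training from the current weights
learn.fit(20, lr=1e-3, cbs=ReduceLROnPlateau(monitor='valid_loss', patience=2, factor=10.))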

2 Likes

Not quite. If you look at the source code for these functions, you’re passing in a percentage to work off of. For instance, with fit_flat_cos we start reducing the LR after 75% of the batches. The same parameter is in fit_one_cycle:

def fit_one_cycle(self:Learner, n_epoch, lr_max=None, div=25., div_final=1e5,
                  pct_start=0.25,  # <- HERE
                  wd=None, moms=None, cbs=None, reset_opt=False):

def fit_flat_cos(self:Learner, n_epoch, lr=None, div_final=1e5,
                 pct_start=0.75,  # <- HERE
                 wd=None, cbs=None, reset_opt=False):
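
So you can move that point yourself by passing your own pct_start (a sketch, assuming an existing `learn`):

learn.fit_flat_cos(5, lr=1e-3, pct_start=0.5)       # stay flat for 50% of training, then cosine-anneal
learn.fit_one_cycle(5, lr_max=1e-3, pct_start=0.1)  # spend only 10% of training on the LR warm-up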
3 Likes

Ok I got that, but I was talking about the ReduceLROnPlateau callback. :sweat_smile:

2 Likes

Ah shoot, my bad :slight_smile: In that case I don’t know, but I am anticipating the answer too :slight_smile:

1 Like

Why not roll back to the last best state, reduce the LR, and continue

I am by no means an expert on any of this, but I can take a guess. Sometimes the extra training beyond the “plateau” is useful for deep models, even though this training would go into the “overfitting” regime by classical machine learning standards. Empirically, there is a second descent phase of training in deep learning.

The figure I have in mind is Figure 2 from the paper about “Deep Double Descent” (arXiv:1912.02292), which does a great job of explaining these concepts (and much more) in detail. Relevant to our discussion is this result: training beyond the plateau causes the validation/test error to rise, and then miraculously fall again. In other words, extra training can undo overfitting! (The same can be accomplished by using larger models.)

Now, is this why deep learning practitioners (and ReduceLROnPlateau) don’t roll back to an earlier epoch? Probably not; I’m guessing the callback was written before fastai made the idea of callbacks as popular as it is now. But perhaps it has had unintentional benefits!

EDIT – accidentally included the wrong figure.

3 Likes

A less advanced question ;).

Reading the notebooks, I was wondering if we still need to normalize the images (Normalize.from_stats(*imagenet_stats)) in v2? I couldn’t find any information regarding normalization.

Florian

Yes you do, normalization should always be done on your data! However, fastai2 has made it a bit easier with the pre-built functions (like cnn_learner and unet_learner). If you use them and, say, accidentally forget to tack on normalize(), it’ll normalize based on your model’s data.
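
If you build your DataLoaders by hand instead, you can still add it explicitly. A sketch (`path`, the labelling function, and the transforms are placeholders):

from fastai.vision.all import *

dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                   get_items=get_image_files,
                   get_y=parent_label,
                   item_tfms=Resize(224),
                   batch_tfms=[*aug_transforms(),
                               Normalize.from_stats(*imagenet_stats)])
dls = dblock.dataloaders(path)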

This video is so much clearer than mine. Can you please share your OBS settings with me? :slight_smile:

1 Like

Of course! Also for anyone else interested (these are the settings I use for Walk with fastai2 as well!):

  • Base (Canvas) Resolution: 1920x1080
  • Output (Scaled) Resolution: 1920x1080
  • Downscale Filter: Bicubic (Sharpened scaling, 16 samples)
  • Common FPS Values: 30
2 Likes

Thank you! :slight_smile:

This is a great question, I am wondering “why” too :slight_smile:

I’ll run some experiments and check empirically

2 Likes

While this isn’t directly L1-related per se, I’m going to be discussing some multi-label ideas that can apply to Lesson 1, such as how to let your model say “I don’t know” in image classification. This will be done on the Zoom chat in ~10 minutes or so.

I wanted to post here, as I don’t think there’s quite a forum topic where such an idea would fit well.
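
The gist, for anyone who can’t make it (a rough sketch with placeholder paths and labels): treat the single-label problem as multi-label, so the model uses a sigmoid + threshold per class instead of a softmax, and an image where no class clears the threshold comes back with no label at all, i.e. “I don’t know”.

from fastai.vision.all import *

def label_as_list(fname): return [parent_label(fname)]   # wrap the single label in a list

dblock = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
                   get_items=get_image_files,
                   get_y=label_as_list,
                   item_tfms=Resize(224))
dls = dblock.dataloaders(path)
learn = cnn_learner(dls, resnet34, metrics=accuracy_multi)  # loss defaults to BCE with logits here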

1 Like

I am coming! See you there :slight_smile:

1 Like

Just to add to what @muellerzr said:
"
You need to pass to this transform the mean and standard deviation that you want to use; fastai comes with the standard ImageNet mean and standard deviation already defined. (If you do not pass any statistics to the Normalize transform, fastai will automatically calculate them from a single batch of your data.) "
From the book: https://github.com/fastai/fastbook/blob/master/07_sizing_and_tta.ipynb (Just under Normalization).
I have a question about this part:
"(If you do not pass any statistics to the Normalize transform, fastai will automatically calculate them from a single batch of your data.) "
Why are we doing it on a single batch and not the whole data?
is this because we are doing it on the fly?
Is in’t the imagenet stats calculated on the whole data? or is that on a single batch too?
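
To illustrate what I mean (a sketch, assuming `dls` is an existing DataLoaders without a Normalize transform):

from fastai.vision.all import *

x, y = dls.one_batch()
batch_stats = x.mean(dim=[0, 2, 3]), x.std(dim=[0, 2, 3])  # per-channel stats of ONE batch
print(batch_stats)     # roughly what Normalize() would estimate from during setup
print(imagenet_stats)  # vs. the fixed stats computed over the whole of ImageNet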

2 Likes