Stair stepping descent with fit-flat-cos working really well

LessW2020 · February 8, 2020, 12:19am

I’ve been doing a lot of model training recently, and something that seems to work better than plain Ranger + fit_flat_cos is what I’ll call stair-stepping descent with fit_flat_cos.

Like this:
Run 1 - 3 epochs @ 8e-3, fit_flat_cos
Run 2 - 5 epochs @ 1e-3, fit_flat_cos
Run 3 - 8 epochs @ 8e-4, fit_flat_cos

etc. Basically a short run with flat and slide (constant lr, then the cosine drop)…then another run with flat and slide…with the starting lr decayed on each run.
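To make the shape of that schedule concrete, here is a rough sketch in plain Python (no fastai needed). It reads each line of the list above as run number, epochs, and starting lr, which is an assumption about the notation. The names `flat_cos_lr` and `stair_step_schedule` are made up for illustration, and `pct_start=0.75`, `div_final=1e5` and `steps_per_epoch=100` are assumed values, not settings taken from the post.

```python
import math

def flat_cos_lr(step, total_steps, lr, pct_start=0.75, div_final=1e5):
    """LR at `step` of one flat-then-cosine run.

    Holds `lr` for the first `pct_start` fraction of the run, then
    cosine-anneals down to lr / div_final (both values are assumptions).
    """
    flat_steps = int(total_steps * pct_start)
    if step < flat_steps:
        return lr
    # Progress through the annealing phase, from 0.0 up to 1.0.
    anneal_steps = max(total_steps - flat_steps - 1, 1)
    progress = min((step - flat_steps) / anneal_steps, 1.0)
    lr_end = lr / div_final
    return lr_end + (lr - lr_end) * (1 + math.cos(math.pi * progress)) / 2

def stair_step_schedule(stages, steps_per_epoch, pct_start=0.75):
    """Concatenate one flat-cos run per (epochs, lr) stage."""
    schedule = []
    for epochs, lr in stages:
        total = epochs * steps_per_epoch
        schedule.extend(flat_cos_lr(s, total, lr, pct_start) for s in range(total))
    return schedule

# (epochs, starting lr) per run, as read from the list in the post.
STAGES = [(3, 8e-3), (5, 1e-3), (8, 8e-4)]
schedule = stair_step_schedule(STAGES, steps_per_epoch=100)
```

Plotting `schedule` should show three plateaus, each lower than the last, with a cosine drop at the end of each. With a fastai `Learner`, the same idea would presumably be one `fit_flat_cos` call per stage, each with its own epoch count and lr.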

My theory is that the repeated run and drop, run and drop helps the model steadily work its way down into a nice valley, compared with running one longer fit_flat_cos.

I’ll keep testing it and hopefully turn it into a single callback (feel free to beat me to it), but I wanted to throw it out there that this approach seems to be working quite well on my private datasets.

muellerzr · February 8, 2020, 12:37am

How does it work with ImageWoof?

LessW2020 · February 8, 2020, 12:44am

We’ll find out tomorrow

LessW2020 · February 8, 2020, 6:21am

I can match it (fit_flat_cos) but haven’t beaten it yet.

One difference, though, is that I’m using EfficientNet for work, whereas this test is on XResNet.

I’ll have to see if that makes a difference.