Stair stepping descent with fit-flat-cos working really well

I’ve been doing a lot of model training recently, and something that seems to work better than plain Ranger + fit_flat_cos is what I’ll term stair-stepping descent with fit_flat_cos.

Like this (run number, then epochs @ starting lr):
1 - 3 epochs @ 8e-3, fit_flat_cos
2 - 5 epochs @ 1e-3, fit_flat_cos
3 - 8 epochs @ 8e-4, fit_flat_cos

And so on. Basically a short run with flat and slide, then another run with flat and slide, with a lower starting lr on each run.
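To make the shape of the schedule concrete, here is a rough sketch in plain Python (no fastai needed) that just computes the learning rate per step for the three runs above. The helper names `flat_cos_lr` and `stair_step_schedule` are made up for illustration, `steps_per_epoch=10` is arbitrary, and the `pct_start=0.75` / `div_final=1e5` defaults are my assumption of what fit_flat_cos uses, so treat it as an approximation rather than the exact fastai schedule.

```python
import math

def flat_cos_lr(pos, lr, pct_start=0.75, div_final=1e5):
    """LR at relative position pos in [0, 1] of one flat-then-cosine run.

    Hypothetical helper: flat at `lr` for the first `pct_start` of the run,
    then cosine-annealed down to `lr / div_final` (assumed defaults).
    """
    if pos < pct_start or pct_start >= 1:
        return lr
    # Fraction of the way through the cosine ("slide") part of the run.
    p = (pos - pct_start) / (1 - pct_start)
    end = lr / div_final
    return end + (lr - end) * (1 + math.cos(math.pi * p)) / 2

def stair_step_schedule(stages, steps_per_epoch=10, pct_start=0.75, div_final=1e5):
    """Concatenate one flat+cosine run per (n_epoch, lr) stage."""
    lrs = []
    for n_epoch, lr in stages:
        n_steps = n_epoch * steps_per_epoch
        for i in range(n_steps):
            lrs.append(flat_cos_lr(i / n_steps, lr, pct_start, div_final))
    return lrs

# The three runs from the post, read as (epochs, starting lr).
STAGES = [(3, 8e-3), (5, 1e-3), (8, 8e-4)]
schedule = stair_step_schedule(STAGES)

# Print the LR at the start of each run to see the "stairs".
start = 0
for n_epoch, lr in STAGES:
    print(f"run of {n_epoch} epochs starts at step {start} with lr={schedule[start]:g}")
    start += n_epoch * 10
```

With an existing fastai `Learner` (assumed here to be called `learn`), the same idea would presumably just be a loop like `for n_epoch, lr in STAGES: learn.fit_flat_cos(n_epoch, lr=lr)`.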

My theory is that the constant run and drop, run and drop, helps it steadily work its way down into a nice valley, versus running one longer single fit_flat_cos.

I’ll keep testing it and hopefully turn it into a single callback (feel free to beat me to it :slight_smile:), but I just wanted to throw it out there that this approach seems to be working quite well on my private datasets.

How does it work with ImageWoof? :wink:

We’ll find out tomorrow :slight_smile:

I can match it (fit_flat_cos) but haven’t beaten it yet.

One difference, though, is that I’m using EfficientNet for work, whereas this is XResNet.

I’ll have to see if that makes a difference.