Which of the HuggingFace optimizer schedulers are already implemented in fastai?

The pytorch-transformers library makes the following schedulers available via its API:

  • none
  • warmup_constant
  • warmup_linear
  • warmup_cosine
  • warmup_cosine_hard_restarts
  • warmup_cosine_warmup_restarts
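
For reference, this is roughly how one of these schedules is driven on the pytorch-transformers side — a minimal sketch assuming the 1.x WarmupLinearSchedule/AdamW classes; the tiny model and the step counts are just placeholders:

```python
import torch
from pytorch_transformers import AdamW, WarmupLinearSchedule

model = torch.nn.Linear(10, 2)                      # stand-in model
optimizer = AdamW(model.parameters(), lr=2e-5)
# linear warmup for 100 steps, then linear decay to 0 over 1000 total steps
scheduler = WarmupLinearSchedule(optimizer, warmup_steps=100, t_total=1000)

x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
for step in range(1000):
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()      # advance the schedule on every optimizer step
    optimizer.zero_grad()
```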

I thought I saw at least a few of these implemented in an article Sylvain wrote, but I can’t find it … nor do I see any of them in the docs (though I can’t help but think they might be somewhere).

Anyhow, I don’t want to repeat work that has already been done, so if any of you know whether these schedulers already exist as fastai callbacks, I’d definitely appreciate knowing where to look :slight_smile:

You can implement any of these easily with fastai’s general scheduler (the GeneralScheduler callback).
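
For example, a warmup_linear-style schedule could be sketched on top of GeneralScheduler and TrainingPhase like this — the function name, the warmup_pct default, and the lr/100 floor are my own placeholders rather than anything from the fastai docs:

```python
from fastai.basics import *   # Learner, annealing_linear, ...
from fastai.callbacks.general_sched import GeneralScheduler, TrainingPhase

def fit_warmup_linear(learn:Learner, epochs:int, lr:float, warmup_pct:float=0.1):
    "Linear warmup to `lr`, then linear decay, scheduled per batch."
    total_steps  = epochs * len(learn.data.train_dl)
    warmup_steps = int(total_steps * warmup_pct)
    phases = [
        # phase 1: warm up from a small lr to the target lr
        TrainingPhase(warmup_steps).schedule_hp('lr', (lr/100, lr), anneal=annealing_linear),
        # phase 2: decay back down over the remaining steps
        TrainingPhase(total_steps - warmup_steps).schedule_hp('lr', (lr, lr/100), anneal=annealing_linear),
    ]
    learn.callbacks.append(GeneralScheduler(learn, phases))
    learn.fit(epochs)
```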


In your fit_sgd_warm example, is there any particular reason you are including mom as a parameter?

The hyperparameter isn’t changing over the schedule, so I was just wondering why it is included rather than simply assigning learn.opt_func to an optimizer function with momentum set there.

… or, is it a best practice to include things like lr, mom, beta, eps, wd in our custom schedulers so it is easy for folks to change these params without having to define learn.opt_func?

It’s easier to change it this way than by setting a new partial optim function; that’s why I included it like this.
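
To make the difference concrete (assuming the fit_sgd_warm signature from the docs example), compare:

```python
# change momentum per call, through the scheduler's own argument
fit_sgd_warm(learn, n_cycles=3, lr=1e-3, mom=0.95, cycle_len=1, cycle_mult=2)

# vs. baking it into the optimizer, which means redefining opt_func each time it changes
from functools import partial
from torch import optim
learn.opt_func = partial(optim.SGD, momentum=0.95)
```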
