Which of the HuggingFace optimizer schedulers are already implemented in fastai?

The pytorch-transformers library makes the following schedulers available via its API:

  • none
  • warmup_constant
  • warmup_linear
  • warmup_cosine
  • warmup_cosine_hard_restarts
  • warmup_cosine_warmup_restarts
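
For reference, this is roughly how one of these schedules is driven on the pytorch-transformers side — a minimal sketch assuming the 1.x WarmupLinearSchedule/AdamW classes; the tiny model and the step counts are just placeholders:

```python
import torch
from pytorch_transformers import AdamW, WarmupLinearSchedule

model = torch.nn.Linear(10, 2)                      # stand-in model
optimizer = AdamW(model.parameters(), lr=2e-5)
# linear warmup for 100 steps, then linear decay to 0 over 1000 total steps
scheduler = WarmupLinearSchedule(optimizer, warmup_steps=100, t_total=1000)

x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
for step in range(1000):
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()      # advance the schedule on every optimizer step
    optimizer.zero_grad()
```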

I thought I saw at least a few of these implemented in an article Sylvain wrote, but I can’t find it … nor do I see any of them in the docs (though I can’t help but think they might be somewhere).

Anyhow, I don’t want to repeat work that has already been done, so if any of you know whether these schedulers already exist as fastai callbacks, I’d definitely appreciate knowing where to look :slight_smile:

You can implement any of these easily with fastai’s general scheduler (the GeneralScheduler callback).
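
For example, a warmup_linear-style schedule could be sketched on top of GeneralScheduler and TrainingPhase like this — the function name, the warmup_pct default, and the lr/100 floor are my own placeholders rather than anything from the fastai docs:

```python
from fastai.basics import *   # Learner, annealing_linear, ...
from fastai.callbacks.general_sched import GeneralScheduler, TrainingPhase

def fit_warmup_linear(learn:Learner, epochs:int, lr:float, warmup_pct:float=0.1):
    "Linear warmup to `lr`, then linear decay, scheduled per batch."
    total_steps  = epochs * len(learn.data.train_dl)
    warmup_steps = int(total_steps * warmup_pct)
    phases = [
        # phase 1: warm up from a small lr to the target lr
        TrainingPhase(warmup_steps).schedule_hp('lr', (lr/100, lr), anneal=annealing_linear),
        # phase 2: decay back down over the remaining steps
        TrainingPhase(total_steps - warmup_steps).schedule_hp('lr', (lr, lr/100), anneal=annealing_linear),
    ]
    learn.callbacks.append(GeneralScheduler(learn, phases))
    learn.fit(epochs)
```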


In your fit_sgd_warm example, is there any particular reason you are including mom as a parameter?

The hyperparameter isn’t changing over the schedule, so I was just wondering why it is included rather than simply assigning learn.opt_func to an optimizer function with momentum set there.

… or, is it a best practice to include things like lr, mom, beta, eps, wd in our custom schedulers so it is easy for folks to change these params without having to define learn.opt_func?

It’s easier to change it this way than by setting a new partial optim function; that’s why I included it like this.
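
To make the difference concrete (assuming the fit_sgd_warm signature from the docs example), compare:

```python
# change momentum per call, through the scheduler's own argument
fit_sgd_warm(learn, n_cycles=3, lr=1e-3, mom=0.95, cycle_len=1, cycle_mult=2)

# vs. baking it into the optimizer, which means redefining opt_func each time it changes
from functools import partial
from torch import optim
learn.opt_func = partial(optim.SGD, momentum=0.95)
```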
