Mapping from v0.7 to 1.0 (use_clr_beta)

rother · April 5, 2019, 8:26am

Hi,
quick question to check whether I understand the documentation correctly. In v0.7 I had a learner with use_clr_beta(10,10,0.95,0.85). I have translated this to learn.fit_one_cycle(1, lr, div_factor=10.0, pct_start=0.9, moms=(0.95,0.85)).

Is this correct? In the use_clr_beta article sgugger writes:

use_clr_beta takes two basic arguments, four if you want to add cyclical momentums. The first one is the ratio between the initial learning rate and the maximum one (typically 10 or 20). The second one is the percentage of the cycle you want to spend on the last part (on the picture 350 to 400), 10 seems to be a good pick once again.

The documentation for fit_one_cycle has two arguments, div_factor and pct_start, that seem like good candidates. Unfortunately the arguments are not really explained in the docs. I interpreted div_factor as the ratio from use_clr_beta and pct_start as roughly the complement of the second argument of use_clr_beta: 10 there means spend 10% of the cycle on the last part, which would translate to pct_start=0.9, which I read as “spend this fraction at the start”.

Is that correct? (If not, any suggestions on how to map use_clr_beta correctly?)

Edit: fixed typos

sgugger · April 5, 2019, 1:54pm

Yup, that is correct.
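
For anyone doing the same migration, the mapping confirmed above can be sketched as a small helper. This is only an illustration: the function name clr_beta_to_one_cycle and its argument names are hypothetical and not part of fastai. It just builds the keyword arguments used in the first post.

```python
def clr_beta_to_one_cycle(div, pct, max_mom=None, min_mom=None):
    """Translate v0.7 use_clr_beta values into fit_one_cycle keyword arguments.

    Hypothetical helper (not a fastai function), following the mapping
    discussed in this thread:
      div -> div_factor (ratio between maximum and initial learning rate)
      pct -> percentage of the cycle spent on the last part, so
             pct_start = 1 - pct / 100
    """
    kwargs = {
        "div_factor": float(div),
        "pct_start": 1 - pct / 100,
    }
    # Momentums are optional in use_clr_beta; pass them on only if both are given.
    if max_mom is not None and min_mom is not None:
        kwargs["moms"] = (max_mom, min_mom)
    return kwargs


kwargs = clr_beta_to_one_cycle(10, 10, 0.95, 0.85)
print(kwargs)
# With a v1 learner `learn` and a learning rate `lr` as in the first post:
# learn.fit_one_cycle(1, lr, **kwargs)
```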

rother · June 11, 2019, 6:36am

Thx. In 0.7 we also had “Slanted Triangular Learning Rates”, which had the parameters ratio and cut_fract. Can this be reproduced with one_cycle as well? If so, how?

sgugger · June 11, 2019, 12:50pm

You will need to implement it with the GeneralScheduler; there is an example in the docs with SGDR.
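
As a starting point for such an implementation, here is a minimal sketch of the slanted triangular schedule itself in plain Python, using the formula from the ULMFiT paper. It does not use the GeneralScheduler API; it only computes the learning rate for a given iteration, so the values can be checked before wiring them into training phases. The function name and the spelling cut_frac (taken from the paper, assumed to correspond to cut_fract above) are assumptions.

```python
import math


def slanted_triangular_lr(t, total_iters, lr_max, ratio=32, cut_frac=0.1):
    """Learning rate at iteration t for a slanted triangular schedule.

    Hypothetical helper, not a fastai function.
    The rate rises linearly from lr_max / ratio to lr_max over the first
    cut_frac of the iterations, then decays linearly back to lr_max / ratio.
    """
    cut = math.floor(total_iters * cut_frac)  # iteration where the peak is reached
    if cut < 1:
        raise ValueError("total_iters * cut_frac must be at least 1")
    if t < cut:
        p = t / cut  # increasing part
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # decreasing part
    return lr_max * (1 + p * (ratio - 1)) / ratio


# Print a few points of a 100-iteration schedule with a peak of 0.01.
for t in (0, 5, 10, 55, 100):
    print(t, slanted_triangular_lr(t, total_iters=100, lr_max=0.01))
```

The schedule has two linear pieces, so it should correspond to two consecutive phases with linear annealing: a short increasing one covering cut_frac of the iterations and a long decreasing one covering the rest.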