How to choose pct_start in OneCycleScheduler?

This post explains what pct_start means very well, but I can't find any guidance on how to choose it. What impact does pct_start have on model training? Why was it set to a different value from the default 0.3 in these notebooks:


Through trial and error :wink:
More seriously, initial training sometimes requires a really long warm-up (hence the 0.9 in the first notebook). By contrast, fine-tuning often barely requires a warm-up at all (hence the 0.1 in the second).
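
To make the effect concrete, here is a rough sketch of a one-cycle schedule in plain Python. It is not fastai's actual implementation: the function names (`one_cycle_lr`, `peak_step`), the cosine interpolation, and the `div_factor` / `final_div` values are all assumptions made for illustration. The only point is to show that pct_start decides what fraction of the steps is spent ramping the learning rate up before it starts to decay.

```python
import math


def cos_interp(start, end, frac):
    """Cosine interpolation from start (frac=0) to end (frac=1)."""
    return end + (start - end) * (1 + math.cos(math.pi * frac)) / 2


def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3,
                 div_factor=25.0, final_div=1e4):
    """Simplified one-cycle learning rate (illustrative, not fastai's code).

    The first pct_start fraction of steps warms up from max_lr / div_factor
    to max_lr; the remaining steps decay from max_lr to max_lr / final_div.
    """
    start_lr = max_lr / div_factor
    end_lr = max_lr / final_div
    warmup_steps = max(1, round(total_steps * pct_start))
    if step < warmup_steps:
        return cos_interp(start_lr, max_lr, step / warmup_steps)
    decay_steps = max(1, total_steps - warmup_steps)
    return cos_interp(max_lr, end_lr, (step - warmup_steps) / decay_steps)


def peak_step(total_steps, max_lr, pct_start):
    """Index of the step at which the schedule reaches its highest value."""
    lrs = [one_cycle_lr(s, total_steps, max_lr, pct_start)
           for s in range(total_steps + 1)]
    return max(range(len(lrs)), key=lrs.__getitem__)


if __name__ == "__main__":
    # Compare a long warm-up, the default, and a short warm-up.
    for pct in (0.9, 0.3, 0.1):
        print(f"pct_start={pct}: peak at step {peak_step(100, 1e-3, pct)} of 100")
```

With a large pct_start the model spends most of training at low-to-rising learning rates, which can help a freshly initialised network settle before it sees the peak rate. With a small pct_start the peak arrives almost immediately and most of the run is spent annealing, which tends to suit fine-tuning weights that are already in a reasonable place.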


This makes a lot of sense. Thanks!