Hi, can someone help me understand the following about the cosine annealer?
import math
def sched_cos(start, end, pos):
    return start + (1 + math.cos(math.pi*(1-pos))) * (end-start) / 2
What is the purpose of pos here? The actual implementation uses Tcur/Tmax, as per the PyTorch documentation.
Why do we use 1-pos?
How does the lr vary with this function, and what are its min and max values?
pos is the progress ratio in the one_cycle going from 0 --> 1.
end-start is the total variation of the parameter, which is then offset by start.
For example, momentum starting at start=0.85 and reaching its maximum at end=0.95.
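To make the role of pos and 1-pos concrete, here is a small sketch that evaluates the function at the ends and the middle of the cycle. The helper closed_form is a hypothetical name; it is my transcription of the closed-form expression in the PyTorch CosineAnnealingLR docs (eta_min, eta_max, Tcur/Tmax), not the library's actual code.

```python
import math

def sched_cos(start, end, pos):
    # Same formula as in the question.
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def closed_form(eta_min, eta_max, t_cur, t_max):
    # Hypothetical helper: closed-form cosine annealing written with
    # Tcur/Tmax, as the formula appears in the PyTorch documentation.
    return eta_min + (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_max)) / 2

# pos = 0 gives start, pos = 0.5 the midpoint, pos = 1 gives end.
for pos in (0.0, 0.5, 1.0):
    print(pos, round(sched_cos(0.85, 0.95, pos), 4))

# With start = eta_max, end = eta_min and pos = Tcur/Tmax,
# both expressions give the same value.
same = all(
    math.isclose(sched_cos(0.1, 0.001, t / 10), closed_form(0.001, 0.1, t, 10))
    for t in range(11)
)
print(same)
```

So the 1-pos only flips the direction of the cosine: the value moves from start (at pos=0) to end (at pos=1), and the min and max are simply whichever of start and end is smaller or larger.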
Try this code:
import matplotlib.pyplot as plt
fx = [sched_cos(0.85, 0.95, i / 100) for i in range(101)]  # range() only takes ints, so scale to 0..1
plt.plot(fx)
plt.show()
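We want a half-wave in the cosine annealer: as pos goes from 0 to 1, the argument pi*(1-pos) covers only half a period of the cosine. Since 1 + cos(...) ranges from 0 to 2, we divide by 2 to scale it to the range 0 to 1.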
To answer your first question: the warm restart happens because each cycle is only a half-wave. When a half-wave ends, the next cycle does not continue the wave smoothly; it starts over from its initial value, a full (end-start) away from where the previous cycle finished. That sudden jump is the warm restart.
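A minimal sketch of that restart behaviour, assuming a fixed cycle length (the paper also lets the cycle length grow, which is left out here). The function lr_with_restarts and the values lr_max=0.1, lr_min=0.001, cycle_len=10 are made up for illustration.

```python
import math

def sched_cos(start, end, pos):
    # Same formula as in the question.
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def lr_with_restarts(step, cycle_len, lr_max, lr_min):
    # Hypothetical helper: pos is the progress inside the current cycle,
    # so it drops back to 0 at the start of every new cycle.
    pos = (step % cycle_len) / cycle_len
    return sched_cos(lr_max, lr_min, pos)

lrs = [lr_with_restarts(s, cycle_len=10, lr_max=0.1, lr_min=0.001) for s in range(30)]
for s, lr in enumerate(lrs):
    print(s, round(lr, 4))
```

Within each cycle the lr falls along a half cosine wave from lr_max towards lr_min, and at steps 10 and 20 it jumps straight back to lr_max. Because pos is computed with a modulo here, it never reaches exactly 1, so the last step of a cycle is close to lr_min but not equal to it.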
Please check the figures in this paper https://arxiv.org/pdf/1608.03983.pdf