Understand cosine annealer

champs.jaideep · April 27, 2019, 4:11pm

Hi, can someone help me understand the following things about the cosine annealer?

    def sched_cos(start, end, pos):
        return start + (1 + math.cos(math.pi*(1-pos))) * (end-start) / 2

What is the purpose of pos here? The actual implementation uses T_cur/T_max, as per the PyTorch documentation:
https://pytorch.org/docs/stable/optim.html
Why do we use 1-pos?
How does the lr vary under this function, and what are its min and max values?

Kaspar · April 27, 2019, 4:35pm

pos is the progress ratio within the one-cycle schedule, going from 0 to 1.
end-start is the total variation of the parameter, which is then offset by start.

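For example, take momentum starting at start=0.85 with a maximum of end=0.95.
Try this snippet (it assumes sched_cos from above is defined):

    import matplotlib.pyplot as plt
    # pos runs from 0 to 1 in steps of 0.01
    fx = [sched_cos(0.85, 0.95, i / 100) for i in range(101)]
    plt.plot(fx)
    plt.show()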
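
On the 1-pos question, here is a small sketch (not taken from any library) that compares sched_cos with the cosine annealing formula as written in the PyTorch docs, assuming pos = T_cur/T_max, start = eta_max and end = eta_min. The name pytorch_style and the numeric values are made up for illustration.

```python
import math

def sched_cos(start, end, pos):
    # Half cosine wave: returns `start` at pos=0 and `end` at pos=1.
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def pytorch_style(eta_max, eta_min, t_cur, t_max):
    # Hypothetical helper: the closed-form formula from the PyTorch docs,
    # eta_min + (eta_max - eta_min) * (1 + cos(pi * T_cur / T_max)) / 2
    return eta_min + (eta_max - eta_min) * (1 + math.cos(math.pi * t_cur / t_max)) / 2

t_max = 10  # assumed number of steps in the schedule
pairs = [
    (sched_cos(0.1, 0.001, t / t_max), pytorch_style(0.1, 0.001, t, t_max))
    for t in range(t_max + 1)
]
for t, (a, b) in enumerate(pairs):
    print(f"step {t:2d}  sched_cos={a:.5f}  pytorch_style={b:.5f}")
```

Since cos(pi*(1-pos)) = -cos(pi*pos), both expressions describe the same curve. The value is start at pos=0 and end at pos=1, so the min and max are simply whichever of start and end is smaller and larger.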

champs.jaideep · May 3, 2019, 8:41am

Thanks. A few more questions:

Which part here plays the role of the warm restarts required by SGDR? I guess PyTorch's cosine annealing still doesn't have them?
Why do we divide end-start by 2?

Sorry for so many basic questions.

swagman · May 5, 2019, 7:15am

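We want a half-wave of the cosine in the annealer. The term 1 + cos(...) ranges from 0 to 2, so we divide by 2 to scale it to the range 0 to 1 before multiplying by end-start.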

swagman · May 5, 2019, 7:21am

To answer your first question: the warm restart comes from the half-wave. sched_cos itself only traces one half-wave from start to end. Since the wave is halved, the next cycle does not continue smoothly; pos goes back to 0 and the value begins again from start. That sudden jump is the warm restart.
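
A minimal sketch of how restarts could be layered on top of sched_cos, assuming a fixed cycle length (the SGDR paper also lets cycles grow over time, which is left out here). The helper sched_cos_restarts and the numbers are hypothetical, not part of any library.

```python
import math

def sched_cos(start, end, pos):
    # Half cosine wave: returns `start` at pos=0 and `end` at pos=1.
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def sched_cos_restarts(start, end, step, cycle_len):
    # Hypothetical helper: pos is reset to 0 every `cycle_len` steps,
    # so the value jumps back to `start` at the beginning of each cycle.
    pos = (step % cycle_len) / cycle_len
    return sched_cos(start, end, pos)

# Learning rate annealed from 0.1 towards 0.001, restarting every 5 steps.
# pos never reaches exactly 1 inside a cycle, so `end` is approached, not hit.
lrs = [sched_cos_restarts(0.1, 0.001, step, cycle_len=5) for step in range(15)]
for step, lr in enumerate(lrs):
    print(f"step {step:2d}  lr={lr:.5f}")
```

Plotting lrs gives the sawtooth-like shape of repeated half-waves, with the jump back up at steps 5 and 10 being the restart.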

swagman · May 5, 2019, 7:22am

Please check the figures in this paper https://arxiv.org/pdf/1608.03983.pdf