Understand cosine annealer

Hi, can someone help me understand the following things about the cosine annealer?

    import math
    def sched_cos(start, end, pos):
        return start + (1 + math.cos(math.pi*(1-pos))) * (end-start) / 2
  1. What is the purpose of pos here? The actual implementation uses Tcur/Tmax, as per the PyTorch documentation:
    https://pytorch.org/docs/stable/optim.html

  2. Why do we use 1-pos?

  3. How does the lr vary with this function, and what are its min and max values?

pos is the progress ratio through the one_cycle schedule, going from 0 to 1.
end-start is the total change in the parameter, which is then offset by start.
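
To make the role of pos and 1-pos concrete, here is a small sketch that evaluates the function from the question at three points. The values 0.85 and 0.95 are only illustrative assumptions. The idea is that cos(pi*(1-pos)) is -1 at pos=0 and +1 at pos=1, so the output runs from start to end rather than the other way round.

```python
import math

def sched_cos(start, end, pos):
    # Same formula as in the question: a half cosine wave from start to end.
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

# Illustrative values only (assumed, not a library default).
start, end = 0.85, 0.95

at_begin = sched_cos(start, end, 0.0)   # cos(pi) = -1, so nothing is added to start
at_middle = sched_cos(start, end, 0.5)  # cos(pi/2) = 0, halfway between start and end
at_finish = sched_cos(start, end, 1.0)  # cos(0) = 1, so the full (end - start) is added

print(at_begin, at_middle, at_finish)
```

If start is set to the maximum learning rate and end to the minimum, the same function anneals downward, and pos then plays the role of Tcur/Tmax.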

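For example, take momentum starting at start=0.85 and rising to a maximum of end=0.95.
Try this code:
    import numpy as np
    import matplotlib.pyplot as plt
    # pos sweeps from 0 to 1 in 101 evenly spaced steps
    fx = [sched_cos(start=0.85, end=0.95, pos=pos) for pos in np.linspace(0, 1, 101)]
    plt.plot(fx)
    plt.show()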

Thanks, a few more questions:

  1. Which part here plays the role of the warm restarts required by SGDR? PyTorch's cosine annealing still doesn't have them, I guess?

  2. Why do we divide end-start by 2?

Sorry for the many basic questions :)

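We want a single half-wave of the cosine in the annealer. Since 1 + cos(...) ranges from 0 to 2, we divide by 2 so that the factor multiplying (end-start) goes from 0 to 1.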
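
As a worked check on the factor of 2, the function from the question can be rewritten into the form given in the PyTorch documentation for CosineAnnealingLR. This assumes the mapping start = η_max, end = η_min and pos = T_cur/T_max, which is my reading of the names rather than something the docs state, and it uses cos(π(1-p)) = -cos(πp):

```latex
\begin{aligned}
\mathrm{sched\_cos}(\eta_{\max}, \eta_{\min}, p)
  &= \eta_{\max} + \tfrac{1}{2}\bigl(1 + \cos(\pi(1-p))\bigr)(\eta_{\min} - \eta_{\max}) \\
  &= \eta_{\max} - \tfrac{1}{2}\bigl(1 - \cos(\pi p)\bigr)(\eta_{\max} - \eta_{\min}) \\
  &= \eta_{\min} + \tfrac{1}{2}(\eta_{\max} - \eta_{\min})\bigl(1 + \cos(\pi p)\bigr),
  \qquad p = \frac{T_{cur}}{T_{\max}}
\end{aligned}
```

Because 1 + cos(πp) lies between 0 and 2, the factor 1/2 is what keeps the result between η_min and η_max.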

As for your first question: sched_cos on its own only describes one half-wave, so it contains no restart. The warm restart comes from repeating that half-wave. At the end of each cycle pos is reset to 0, so instead of continuing smoothly the value jumps straight back to start. That sudden jump is the warm restart.
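
A minimal sketch of how restarts could be layered on top of sched_cos, assuming fixed-length cycles. The helper sched_cos_restarts, the cycle length and the learning-rate values are hypothetical names and numbers for illustration, not part of fastai or PyTorch. The only idea shown is that pos is reset to 0 at the start of every cycle.

```python
import math

def sched_cos(start, end, pos):
    # Half cosine wave from start (pos=0) to end (pos=1), as in the question.
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def sched_cos_restarts(start, end, step, cycle_len):
    # Hypothetical helper: pos is the progress inside the current cycle,
    # so it drops back to 0 whenever a new cycle begins.
    pos = (step % cycle_len) / cycle_len
    return sched_cos(start, end, pos)

lr_max, lr_min = 0.1, 0.001   # assumed values for illustration
cycle_len = 10                # assumed number of steps per cycle

lrs = [sched_cos_restarts(lr_max, lr_min, step, cycle_len) for step in range(30)]

# Within a cycle the lr decays; at steps 10 and 20 it jumps back to lr_max.
for step in (0, 9, 10, 19, 20):
    print(step, round(lrs[step], 4))
```

Recent PyTorch releases also provide torch.optim.lr_scheduler.CosineAnnealingWarmRestarts, which may be worth checking before rolling your own.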

Please check the figures in this paper https://arxiv.org/pdf/1608.03983.pdf