Cosine annealing vs triangular policy cyclic learning rate

I am a fast.ai student currently working on Lesson 3 of Part 1, but I am interested in getting my hands dirty by working on a Kaggle competition, the Human Protein Atlas competition. Looking at this kernel, I saw the use_clr argument. It looks like it is for a CyclicLR class, which implements a triangular policy for cyclic learning rates. When is this better than using cosine annealing for cyclic learning rates? If this is addressed in later lessons, could you please let me know which one?
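To make the comparison concrete, here is a minimal sketch of the two schedule shapes as I understand them. It is plain Python and is not taken from the fastai library or from the kernel; the function names `triangular_lr` and `cosine_annealing_lr`, their parameters, and the example values are my own assumptions for illustration only.

```python
import math

def triangular_lr(it, base_lr, max_lr, step_size):
    """Triangular cyclic policy: the rate climbs linearly from base_lr to
    max_lr over step_size iterations, then falls back linearly."""
    cycle = math.floor(1 + it / (2 * step_size))
    x = abs(it / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

def cosine_annealing_lr(it, min_lr, max_lr, cycle_len):
    """Cosine annealing with restarts: the rate starts at max_lr, decays
    along half a cosine towards min_lr, then jumps back to max_lr."""
    t_cur = it % cycle_len
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * t_cur / cycle_len))

if __name__ == "__main__":
    # Hypothetical values, chosen only to show the two shapes side by side.
    for it in (0, 50, 100, 150, 199, 200):
        tri = triangular_lr(it, base_lr=0.001, max_lr=0.01, step_size=100)
        cos = cosine_annealing_lr(it, min_lr=0.001, max_lr=0.01, cycle_len=200)
        print(f"iter {it:3d}  triangular={tri:.5f}  cosine={cos:.5f}")
```

With these definitions, the triangular schedule starts each cycle at the low rate and warms up to the peak, while cosine annealing starts each cycle at the peak and only decays, so the two differ mainly in whether there is a warm-up phase and in how abruptly the rate changes at a cycle boundary.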

(Also, this is my first post here. If this is the wrong place for it, please let me know and I will move it to a different category.)
