I suspect I’m not genuinely grasping how the LR finder is supposed to be used.
If I’m not making mistakes, even with an unfrozen model, the LR vs. loss plot it produces reflects only the final block that fastai adds on top of the ImageNet model.
If so, how can the LR finder be helpful when one has to set the LRs for all the other layer groups?
I would say to test it out yourself. It would be a good learning experience to freeze all layers except the one you want to run lr_find on, and then do it yourself. It would require a bit of PyTorch-y work and some rewriting of the library code, but it could give you a good idea of how the code works.
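The “pytorch-y work” above can be sketched without any framework at all. Here is a minimal, self-contained LR range test (in the spirit of lr_find, not fastai’s actual implementation) on a toy one-parameter regression, just to show the mechanics: sweep the LR exponentially, record the loss, and watch it blow up past some point.

```python
import numpy as np

# Toy problem: fit w in y ≈ w * x by gradient descent on the MSE loss.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 3.0 * x + rng.normal(scale=0.1, size=200)

def lr_range_test(lrs, n_steps=5):
    """For each candidate LR, take a few gradient steps from a fresh
    init and record the resulting loss; divergence shows up as the
    loss blowing up at the high end of the sweep."""
    losses = []
    for lr in lrs:
        w = 0.0
        for _ in range(n_steps):
            grad = 2 * np.mean((w * x - y) * x)  # d(MSE)/dw
            w -= lr * grad
        losses.append(np.mean((w * x - y) ** 2))
    return losses

lrs = np.logspace(-4, 1, 30)   # exponential sweep, like the finder
losses = lr_range_test(lrs)
best = min(losses)
# Pick the largest LR whose loss is still close to the best one:
good_lr = max(lr for lr, l in zip(lrs, losses) if l < 2 * best)
```

Plotting `lrs` against `losses` on a log axis gives the familiar lr_find-style curve: flat at tiny LRs, a dip where learning happens, then an explosion once the steps overshoot.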
All in all though, the main thing we want is the highest learning rate at which the model doesn’t diverge, and that is determined by looking at the later layers, since they are the least related to our specific problem (how will identifying cat faces help me recognize numbers?). lr_find gives us an “upper bound” on the learning rate for the final layers. We know the earlier layers will need a lower learning rate, but we still want to allow it to be fairly high.
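That intuition is what discriminative learning rates express: the upper bound from the finder goes to the final group, and earlier groups get progressively smaller LRs. Here is a small sketch (again not fastai’s actual implementation) of spreading one upper-bound LR geometrically across layer groups, roughly what passing something like `slice(lr_max/10, lr_max)` to fit expresses:

```python
# Hypothetical helper: earliest layer group gets the smallest LR,
# the final group gets lr_max itself, geometrically spaced between.
def discriminative_lrs(lr_max, n_groups, ratio=10.0):
    if n_groups == 1:
        return [lr_max]
    lr_min = lr_max / ratio
    step = (lr_max / lr_min) ** (1.0 / (n_groups - 1))
    return [lr_min * step ** i for i in range(n_groups)]

print(discriminative_lrs(1e-2, 3))  # lowest LR first, lr_max last
```

With three layer groups and an upper bound of 1e-2 this yields 1e-3 for the first group, ~3.2e-3 for the middle, and 1e-2 for the head, so the pretrained early layers move gently while the new final layers train fast.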
I think you have identified a piece of machine learning that we still do “by feel,” but the results have been fairly good so far.