I’ve been tweaking my learning rate after a few epochs using either `model.optimizer.lr = new_lr` or `model.optimizer.lr.set_value(new_lr)`. With the former, my results after a few more epochs line up pretty well with Jeremy’s. With `.set_value`, the “proper” way according to this github thread, I get different, much worse results for the same number of epochs.
Another method I haven’t tried is using the Keras backend (conventionally imported as `K`) to update the learning rate: `K.set_value(self.model.optimizer.lr, new_lr)`.
Any idea what’s going on behind the scenes that would make these methods behave differently?
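For context, here’s the puzzle as I understand it, as a toy sketch in plain Python (hypothetical classes, not actual Keras/Theano internals): if the compiled training function captures the lr shared variable by reference, I’d naively expect `set_value` to take effect and plain assignment to be ignored — which is the opposite of what I’m seeing:

```python
class SharedVariable:
    """Toy stand-in for a Theano shared variable."""
    def __init__(self, value):
        self._value = value

    def set_value(self, value):
        self._value = value

    def get_value(self):
        return self._value

class Optimizer:
    def __init__(self, lr):
        self.lr = SharedVariable(lr)

def compile_update(optimizer):
    lr_var = optimizer.lr  # captures the object, not the attribute name

    def update(grad):
        return lr_var.get_value() * grad

    return update

opt = Optimizer(0.1)
update = compile_update(opt)

opt.lr.set_value(0.01)  # in-place mutation: the compiled update sees it
print(update(1.0))      # 0.01

opt.lr = 0.5            # rebinding: update still holds the old variable
print(update(1.0))      # still 0.01
```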
Wow - wacky! I’ve been wondering about something similar over the last few days: why does the way I’ve been doing it work at all? I’m kind of surprised it does…
Could you show an example notebook in a gist that demonstrates the two different behaviors (and in particular, the worse behavior when doing `set_value()`)? You should probably also show what happens when you don’t change the learning rate at all, so we have a control group.
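The structure I have in mind is something like this (a hypothetical scaffold in plain Python — `run_experiment` and the fake training loop stand in for the real Keras `model.fit` calls the gist would use): three runs that are identical except for the one thing being tested, the mid-training learning-rate change.

```python
import random

def run_experiment(change_lr, seed=42, epochs=6):
    """Fake training run: same seed and epochs every time, so the
    only variable across runs is the change_lr callable (or None)."""
    rng = random.Random(seed)    # fixed seed keeps runs comparable
    lr_holder = {"lr": 0.001}    # stand-in for the optimizer's lr
    loss = 1.0
    for epoch in range(epochs):
        if epoch == epochs // 2 and change_lr is not None:
            change_lr(lr_holder)  # the only thing that varies
        loss *= (1.0 - lr_holder["lr"]) * (0.99 + 0.02 * rng.random())
    return loss

results = {
    "control (no change)": run_experiment(None),
    "set lr to 0.01": run_experiment(lambda h: h.update(lr=0.01)),
    "set lr to 0.1": run_experiment(lambda h: h.update(lr=0.1)),
}
for name, loss in results.items():
    print(f"{name}: {loss:.4f}")
```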
We may need to ask on the Keras mailing list if we can’t figure it out ourselves…
Curses! Good catch, though. What do the results look like if you don’t set the learning rate at all? Also, 0.001 is the default, so maybe try two non-default values, for robustness.
In any case, it’s still unclear why there’s a difference between the results for `set_value` and `=`. Can you even reproduce that on your end?