Different training accuracy using model.optimizer.lr: .set_value vs. =

I’ve been changing my learning rate after a few epochs in two ways: model.optimizer.lr = new_lr and model.optimizer.lr.set_value(new_lr). With the former, my results after a few more epochs line up pretty well with Jeremy’s. With .set_value, the “proper” way according to this github thread, I get different, much worse results for the same number of epochs.

Another method I haven’t tried is to use the Keras backend k to update the learning rate: k.set_value(self.model.optimizer.lr, new_lr).

Any idea what’s going on behind the scenes that would make these methods behave differently?

Wow - wacky!.. I’ve been wondering about something similar over the last few days, which is: why does the way I’ve been doing it work at all? I’m kind of surprised it does…

Could you show an example notebook in a gist that demonstrates the two different behaviors (and in particular, the worse behavior when doing set_value())? You should probably also show what happens when you don’t change the learning rate at all, so we have a control group.

We may need to ask on the keras mailing list if we can’t figure it out ourselves…

Here ya go! Looks like the lr = XX method may not actually change anything.

equals : 83% acc 30% val_acc
set_value : 22% acc 19% val_acc
no change : 85% acc 28% val_acc

I think I can disprove your hypothesis :slight_smile: (The two screenshots compared here were not preserved.)

Curses! :slight_smile: Good catch though. What does the result look like if you don’t set the learning rate at all? Also, 0.001 is the default, so maybe try two non-default values, for robustness.

In any case, it’s still unclear why there’s a difference between results for set_value vs. =. Can you even reproduce that on your end?

…hmmm…

Curious, what’s the verdict here, @jeremy and @robin? :slight_smile:

I’m not sure - I’m continuing to use lr= rather than lr.set_value, since it works for me :slight_smile:

Did you find a solution for this? Is there a difference between Theano and TensorFlow as the backend?

I hit a problem with ModelCheckpoint when using the “model.optimizer.lr = 1e-5” style; here is the thread with more details:
https://github.com/fchollet/keras/issues/5218#issuecomment-275931244

Looks like the correct way to do it (as per that thread :slight_smile:) is:
from keras import backend as K
K.set_value(model.optimizer.lr, 1e-5)
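
For what it’s worth, here is a toy sketch of one mechanism that could make the two styles behave differently. It is plain Python, not Keras: SharedVariable, ToyOptimizer and make_train_step are hypothetical stand-ins, and the assumption is that the training function is built once and keeps a reference to the learning-rate variable it saw at build time. Whether real Keras behaves this way depends on the version, the backend, and whether the assignment happens before or after the training function is first built.

```python
class SharedVariable:
    """Toy stand-in for a backend variable holding a mutable value."""

    def __init__(self, value):
        self._value = value

    def get_value(self):
        return self._value

    def set_value(self, value):
        # Mutates the existing object in place.
        self._value = value


class ToyOptimizer:
    """Toy stand-in for an optimizer whose lr is a shared variable."""

    def __init__(self, lr):
        self.lr = SharedVariable(lr)


def make_train_step(optimizer):
    """Build the update function once, capturing the lr variable object."""
    lr_var = optimizer.lr  # reference captured at build time (assumption)

    def train_step(weight, grad):
        # Plain SGD update using whatever value the captured variable holds.
        return weight - lr_var.get_value() * grad

    return train_step


opt = ToyOptimizer(lr=0.25)
step = make_train_step(opt)

before = step(1.0, 1.0)            # uses lr = 0.25

opt.lr.set_value(0.5)              # in-place update: the built function sees it
after_set_value = step(1.0, 1.0)   # uses lr = 0.5

opt.lr = 0.125                     # rebinds the attribute to a plain float
after_assignment = step(1.0, 1.0)  # built function still uses the old variable

print(before, after_set_value, after_assignment)
```

In this toy, the = assignment changes what opt.lr points to but not the variable the already-built step function holds, so it has no effect on training. That would be consistent with the “equals” and “no change” numbers above being close, but it is only a sketch and does not by itself say which learning rate gives the better result.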

@robin were you able to figure out why different initialization methods lead to different accuracies?