I’ve been tweaking my learning rate after a few epochs using either `model.optimizer.lr = new_lr` or `model.optimizer.lr.set_value(new_lr)`. With the former, my results after a few more epochs line up pretty well with Jeremy’s. With `.set_value`, the “proper” way according to this github thread, I get different, much worse results for the same number of epochs.
Another method I haven’t tried is using the Keras backend (conventionally imported as `K`) to update the learning rate: `K.set_value(self.model.optimizer.lr, new_lr)`.
Any idea what’s going on behind the scenes that would make these methods behave differently?
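For context, here’s the puzzle as I understand it, as a toy sketch in plain Python (hypothetical classes, not actual Keras/Theano internals): if the compiled training function captures the lr shared variable by reference, I’d naively expect `set_value` to take effect and plain assignment to be ignored — which is the opposite of what I’m seeing:

```python
class SharedVariable:
    """Toy stand-in for a Theano shared variable."""
    def __init__(self, value):
        self._value = value

    def set_value(self, value):
        self._value = value

    def get_value(self):
        return self._value

class Optimizer:
    def __init__(self, lr):
        self.lr = SharedVariable(lr)

def compile_update(optimizer):
    lr_var = optimizer.lr  # captures the object, not the attribute name

    def update(grad):
        return lr_var.get_value() * grad

    return update

opt = Optimizer(0.1)
update = compile_update(opt)

opt.lr.set_value(0.01)  # in-place mutation: the compiled update sees it
print(update(1.0))      # 0.01

opt.lr = 0.5            # rebinding: update still holds the old variable
print(update(1.0))      # still 0.01
```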
Wow - wacky! I’ve been wondering about something similar over the last few days: why does the way I’ve been doing it work at all? I’m kind of surprised it does…
Could you show an example notebook in a gist that demonstrates the two different behaviors (and in particular, the worse behavior when doing `set_value()`)? You should probably also show what happens when you don’t change the learning rate at all, so we have a control group.
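The structure I have in mind is something like this (a hypothetical scaffold in plain Python — `run_experiment` and the fake training loop stand in for the real Keras `model.fit` calls the gist would use): three runs that are identical except for the one thing being tested, the mid-training learning-rate change.

```python
import random

def run_experiment(change_lr, seed=42, epochs=6):
    """Fake training run: same seed and epochs every time, so the
    only variable across runs is the change_lr callable (or None)."""
    rng = random.Random(seed)    # fixed seed keeps runs comparable
    lr_holder = {"lr": 0.001}    # stand-in for the optimizer's lr
    loss = 1.0
    for epoch in range(epochs):
        if epoch == epochs // 2 and change_lr is not None:
            change_lr(lr_holder)  # the only thing that varies
        loss *= (1.0 - lr_holder["lr"]) * (0.99 + 0.02 * rng.random())
    return loss

results = {
    "control (no change)": run_experiment(None),
    "set lr to 0.01": run_experiment(lambda h: h.update(lr=0.01)),
    "set lr to 0.1": run_experiment(lambda h: h.update(lr=0.1)),
}
for name, loss in results.items():
    print(f"{name}: {loss:.4f}")
```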
We may need to ask on the Keras mailing list if we can’t figure it out ourselves…
Curses! Good catch, though. What do the results look like if you don’t set the learning rate at all? Also, 0.001 is the default, so maybe try two non-default values, for robustness.
In any case, it’s still unclear why there’s a difference between the results for `set_value` and `=`. Can you even reproduce that on your end?