Useful to rescale RGB coefficients?

oregontrail256 · January 19, 2017, 7:18am

I was reading this tutorial on data augmentation with Keras, and the author recommended rescaling the RGB coefficients of the input images by providing a rescale argument of 1./255 to ImageDataGenerator():

“rescale is a value by which we will multiply the data before any other processing. Our original images consist in RGB coefficients in the 0-255, but such values would be too high for our models to process (given a typical learning rate), so we target values between 0 and 1 instead by scaling with a 1/255. factor.”

But it doesn’t look like the sample notebooks (Dogs vs. Cats, State Farm, etc.) have been doing that. The util.py file doesn’t do that in its get_batches() function. Is there actually value in rescaling, or should a good optimizer plus appropriate tuning of the learning rate deal with this?

jeremy · January 21, 2017, 12:28am

The pre-trained network we use subtracts the mean from the pixel values instead. It’s another way to normalize the inputs. Whenever you use a pretrained network, you have to be sure to use the same method that the original network authors used (e.g. Keras’ Inception network uses the 1/255 approach, whereas VGG subtracts the mean).
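To make the two conventions concrete, here is a minimal NumPy sketch. The function names are made up for illustration, and the per-channel means are the commonly quoted ImageNet RGB means used with VGG; treat them as an assumption and check the preprocessing that ships with whichever pretrained weights you load.

```python
import numpy as np

# Commonly quoted ImageNet per-channel means (RGB order) used with VGG.
# Assumption: verify against the preprocessing of the weights you load.
VGG_MEAN_RGB = np.array([123.68, 116.779, 103.939], dtype=np.float32)

def rescale_unit(x):
    """Scale 0-255 pixel values into [0, 1], like rescale=1./255."""
    return x.astype(np.float32) / 255.0

def subtract_mean(x):
    """Center each channel of a channels-last RGB batch; no scaling."""
    return x.astype(np.float32) - VGG_MEAN_RGB

# Fake batch: 2 images, 4x4 pixels, 3 channels (channels-last).
batch = np.random.RandomState(0).randint(0, 256, size=(2, 4, 4, 3))

scaled = rescale_unit(batch)
centered = subtract_mean(batch)
print(scaled.min(), scaled.max())      # stays within [0, 1]
print(centered.min(), centered.max())  # bounded by about [-123.7, 151.1]
```

The two outputs live on very different scales, which is why feeding a network inputs prepared with the other convention tends to hurt a pretrained model.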

iNLyze · January 22, 2017, 11:52pm

Related: In the lesson 7 notebook you implemented mean subtraction through a Lambda layer. I couldn’t run your code as-is since I am using the TF backend in Keras, not Theano. So I rebuilt your VGG model in Keras, but obviously it doesn’t have the Lambda layer for mean subtraction. I failed at adding this layer to the existing pretrained model. Any suggestions on how to do it?
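One likely stumbling block when moving Theano-oriented preprocessing to TensorFlow is the axis layout: Theano code typically assumes channels-first batches, while the TF backend defaults to channels-last. Below is a hedged sketch of both variants side by side. The names vgg_preprocess_tf and vgg_preprocess_th are hypothetical, and the mean values and the RGB-to-BGR flip are assumptions about what the pretrained VGG weights expect.

```python
import numpy as np

# Assumed ImageNet per-channel means in RGB order.
VGG_MEAN_RGB = np.array([123.68, 116.779, 103.939], dtype=np.float32)

def vgg_preprocess_tf(x):
    """Channels-last variant: x has shape (batch, rows, cols, 3), RGB."""
    x = x - VGG_MEAN_RGB   # broadcasts over the last (channel) axis
    return x[..., ::-1]    # reverse the channel axis: RGB -> BGR

def vgg_preprocess_th(x):
    """Channels-first variant: x has shape (batch, 3, rows, cols), RGB."""
    x = x - VGG_MEAN_RGB.reshape((3, 1, 1))
    return x[:, ::-1]      # reverse the channel axis: RGB -> BGR

# Check that both layouts give the same result on the same data.
rng = np.random.RandomState(0)
imgs_last = rng.uniform(0, 255, size=(2, 5, 6, 3)).astype(np.float32)
imgs_first = imgs_last.transpose(0, 3, 1, 2)

out_last = vgg_preprocess_tf(imgs_last)
out_first = vgg_preprocess_th(imgs_first)
print(np.allclose(out_last, out_first.transpose(0, 2, 3, 1)))  # True
```

Because the channels-last function only uses subtraction and slicing, it may also work on backend tensors inside a Lambda layer, for example by applying Lambda(vgg_preprocess_tf) to the Input tensor and passing the result as input_tensor to VGG16. That wiring is untested here.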

iNLyze · January 23, 2017, 12:06am

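My code snippet to create the VGG model:

from keras.applications.vgg16 import VGG16
from keras.layers import Input

# Channels-last input, as used by the TF backend
input_tensor = Input(shape=(IMAGE_ROW_DIM, IMAGE_COL_DIM, CHANNELS))
#x = Lambda(vgg_preprocess, input_shape=IMAGE_SIZE+(CHANNELS,))
# Pretrained VGG16 without the fully connected top layers
vgg = VGG16(include_top=False, input_tensor=input_tensor)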

#model_vgg = Model()

#model.add(Lambda(vgg_preprocess, input_shape=size+(3,) ))
#model_vgg = Model(input=vgg.input, output=predictions)

# Drop the last two layers (block5_pool and block5_conv3) so they can be replaced by FCN parts
vgg.layers.pop()
vgg.layers.pop()
vgg.outputs = [vgg.layers[-1].output]
vgg.layers[-1].outbound_nodes = []

vgg.compile(optimizer=optimizer, loss=objective, metrics=model_metrics)

vgg.summary()
#print('Input: '+str(vgg.input_shape), 'Output:'+str(vgg.output_shape))