Data preprocessing: image mean subtraction vs. normalization

xinxin.li.seattle · February 9, 2017, 7:40pm

In the end-to-end model for MNIST (lesson 3), we normalize the images in the data preprocessing step. But in the VGG finetuning example (lesson 1, cats vs. dogs), we subtract the image mean without dividing by the standard deviation.

So my question is: when do we normalize the images, and when do we only subtract the mean?
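
To make the two options concrete, here is a minimal NumPy sketch. The function names and the random stand-in data are made up for illustration and are not the course notebooks' code. Mean subtraction only shifts the pixel values so they are centered at zero, while normalization also divides by the standard deviation so the values have roughly unit spread.

```python
import numpy as np

def subtract_mean(images, mean):
    """Center only: the mean becomes 0, the spread of the values is unchanged."""
    return images - mean

def standardize(images, mean, std, eps=1e-7):
    """Center and rescale: the mean becomes 0 and the std becomes roughly 1."""
    # eps is a small illustrative constant that guards against division by zero
    return (images - mean) / (std + eps)

# Hypothetical stand-in for a batch of 28x28 grayscale images with values in 0..255
rng = np.random.default_rng(0)
images = rng.uniform(0, 255, size=(8, 28, 28))

mean, std = images.mean(), images.std()
centered = subtract_mean(images, mean)
standardized = standardize(images, mean, std)

print(centered.mean(), centered.std())          # mean near 0, std unchanged
print(standardized.mean(), standardized.std())  # mean near 0, std near 1
```

Either way the statistics come from the data itself; the only difference is whether the scale of the inputs is also changed.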

Gelu74 · February 9, 2017, 10:40pm

I’d say there is no hard rule; it depends on the type of images you are dealing with.

xinxin.li.seattle · February 9, 2017, 11:04pm

Would you care to elaborate on that? How does the normalization method change as the image type changes?

For example, in the State Farm competition, the images are very different from ImageNet, so when finetuning you want to tune a couple more layers than just the last one. In that case, you might also want to calculate your own image mean instead of using the ImageNet mean, because the two datasets are too different. Assuming that’s what you are going to do, in the preprocessing step do you subtract the calculated image mean, or do you normalize with the calculated mean and std? And why?

Am I thinking along the right lines?

Gelu74 · February 9, 2017, 11:10pm

I think if you are using a pretrained model, especially if you freeze some layers, you should use the same normalization as the people who trained the model. If not, I would say it depends on how dissimilar your images are, but you can always try different approaches and see what works best.
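
As a rough sketch of the two routes described here, assuming NumPy and a channels-last `(N, H, W, 3)` RGB layout (the course notebooks may use a different layout): the first function reuses the per-channel ImageNet means that the original VGG weights were trained with, including the RGB to BGR flip, and the other two compute and apply your own dataset statistics. All function names are hypothetical.

```python
import numpy as np

# Per-channel RGB means used with the original VGG ImageNet weights
VGG_MEAN_RGB = np.array([123.68, 116.779, 103.939])

def preprocess_for_vgg(images_rgb):
    """Match the pretrained model: subtract the ImageNet means, then flip RGB to BGR."""
    centered = images_rgb - VGG_MEAN_RGB  # broadcasts over the last (channel) axis
    return centered[..., ::-1]

def channel_stats(train_images):
    """Per-channel mean and std over your own training set."""
    return train_images.mean(axis=(0, 1, 2)), train_images.std(axis=(0, 1, 2))

def preprocess_with_own_stats(images, mean, std, eps=1e-7):
    """Alternative: normalize with statistics computed from your own data."""
    # eps is an illustrative guard against division by zero
    return (images - mean) / (std + eps)

# Hypothetical stand-in for a small training batch with values in 0..255
rng = np.random.default_rng(0)
train = rng.uniform(0, 255, size=(4, 8, 8, 3))

vgg_ready = preprocess_for_vgg(train)
mean, std = channel_stats(train)
own_ready = preprocess_with_own_stats(train, mean, std)
```

If you go the second route, compute the statistics on the training set only and reuse the same values for validation and test images, so that all inputs go through an identical transform.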