Normalize(imagenet_stats) explained

dats-vs-cogs · September 11, 2019, 11:11am

Hey guys,

I tried reading the docs and looking for an answer on forums - but both are too advanced for my level.

I’m trying to understand:
a) what exactly does .normalize() do? Why do we need to add it at the end?
b) what does imagenet_stats mean? And how are those stats different from cifar_stats, etc.?

Also, where in the docs can I find detailed explanations of these things? (assuming I missed them)

Thanks!

jmenke · September 11, 2019, 12:07pm

Hi,

.normalize() normalizes your data. Neural nets “want” your data to have a mean of zero and a standard deviation of 1, and you can achieve that through normalization.

It’s done by subtracting the mean of the variable from each score and then dividing the result by the standard deviation of that variable.
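
As a minimal sketch of that formula in plain Python (the numbers are made up, and `standardize` is a hypothetical helper name, not a fastai function):

```python
from statistics import fmean, pstdev

def standardize(values):
    """Subtract the mean from each value, then divide by the standard deviation."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data: mean 5, std 2
normalized = standardize(values)

print(normalized)
print(fmean(normalized))   # approximately 0
print(pstdev(normalized))  # approximately 1
```

After this step the data is centered on zero and has unit spread, whatever scale it started on.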

However, if you use a model pretrained on ImageNet, you do not want to use the mean and standard deviation of your own data but the mean and standard deviation of the original ImageNet data.

So the mean and sd of the ImageNet data are stored in the imagenet_stats variable.
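
For reference, a rough sketch of what such stats look like and how they get applied. The numbers are the widely published per-channel ImageNet mean and standard deviation (RGB order, pixel values scaled to [0, 1]), which as far as I know are what imagenet_stats holds. The `normalize_pixel` helper is hypothetical, written only for illustration, and is not part of fastai:

```python
# Per-channel (R, G, B) mean and standard deviation of ImageNet,
# for pixel values scaled to the range [0, 1].
imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std = [0.229, 0.224, 0.225]

def normalize_pixel(rgb, mean, std):
    """Normalize one RGB pixel channel by channel: (value - mean) / std."""
    return [(v - m) / s for v, m, s in zip(rgb, mean, std)]

# A pixel whose color equals the ImageNet average maps to zero in every channel.
average = normalize_pixel(imagenet_mean, imagenet_mean, imagenet_std)
print(average)

# A pure white pixel lands a bit over two standard deviations above zero.
white = normalize_pixel([1.0, 1.0, 1.0], imagenet_mean, imagenet_std)
print(white)
```

The point is that the mean and std come from the dataset the model was trained on, not from the image being normalized.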

dats-vs-cogs · September 12, 2019, 9:24am

Thanks @jmenke.

Maybe a silly question - but how can an image have a mean and a std dev?

If it’s a 1d number array, eg [-5 -4 -3 -2 -1 0 1 2 3 4 5] - I get it. But if it’s a 3d tensor?

Or are you referring to the output scores, not input values?

Is there something I can read on this online? Like a blog post?

Ilja

jmenke · September 12, 2019, 9:56am

Hey,

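So I am assuming they are standardizing the three color channels. Each picture has a red, a green and a blue channel, and each of these channels is represented as a matrix. I think you would take the mean and the sd of the whole matrix, and you would do that for each color. For dataset-wide stats like imagenet_stats, the mean and sd of each channel are taken over all the images in the dataset, which gives three means and three sds in total.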
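
A small sketch of the per-channel idea, using a made-up 2x2 RGB image stored as nested lists (in practice you would use NumPy or PyTorch tensors; `channel_values` is a hypothetical helper name):

```python
from statistics import fmean, pstdev

# A tiny made-up image with shape (height, width, channel), values in [0, 1].
image = [
    [[0.0, 0.25, 0.0], [1.0, 0.25, 0.0]],
    [[0.0, 0.75, 0.0], [1.0, 0.75, 1.0]],
]

def channel_values(img, c):
    """Collect every pixel's value for channel c (0 = red, 1 = green, 2 = blue)."""
    return [pixel[c] for row in img for pixel in row]

# One mean and one standard deviation per color channel.
means = [fmean(channel_values(image, c)) for c in range(3)]
stds = [pstdev(channel_values(image, c)) for c in range(3)]

print(means)
print(stds)
```

So a 3d tensor does not get one single mean. It gets one mean and one sd per channel, each computed over all the pixel values in that channel.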

juliangrosshauser · September 12, 2019, 2:12pm

Is there something I can read on this online? Like a blog post?

The “Data Preprocessing” section in the CS231n course notes explains it pretty well, I think.

dats-vs-cogs · September 15, 2019, 7:41pm

Ok now everything clicks. Thanks for sending that.

@jmenke I now see what you meant. Out of curiosity, why do we need to subtract the mean and divide by the std_dev of the ImageNet data if we’re using an ImageNet model? Like, what does that do for us?