Dividing by 255 vs (img - mean) / std: should both be done?

Trying to build a model that takes in an image, should I do both of these things?
Or is the second one only for pretrained models (which I am not using)?

Can I apply the second one to a new dataset and a new (non-pretrained) model?

Thanks!


Hi Vishak!

Normalization is important not just for pretrained models; it should be done for every task. There are different possible methods, however.

For pretrained models, you have to apply the same normalization as the original model. This is what fastai does with Normalize.from_stats(*imagenet_stats), applying (img - mean) / std with the values computed on ImageNet.
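In fastai v2 that looks roughly like this (a minimal sketch; the dataset path and folder layout are placeholders):

```python
from fastai.vision.all import *

# Apply the ImageNet per-channel mean/std as a batch transform,
# so inputs match what the pretrained model was trained on.
dls = ImageDataLoaders.from_folder(
    'path/to/images',                                 # hypothetical dataset path
    batch_tfms=Normalize.from_stats(*imagenet_stats),
)
```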

When you are training from scratch, normalization is up to you. If you are doing (img - mean) / std, you don’t have to divide by 255 first: as long as the mean and std are measured on the same scale as the pixels, it gives the exact same result.
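You can verify the equivalence with a quick numpy sketch (the stats here are made up):

```python
import numpy as np

img = np.random.randint(0, 256, size=(224, 224, 3)).astype(np.float32)
mean, std = 120.0, 60.0  # hypothetical stats, measured on the raw 0-255 scale

a = (img - mean) / std                      # normalize the raw pixels directly
b = (img / 255 - mean / 255) / (std / 255)  # divide by 255 first, rescale stats
print(np.allclose(a, b))                    # True: exactly the same result
```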

If you only divide by 255, your data will be between 0 and 1 but not centered around 0, which is not optimal for training. Subtract the mean as well (img - mean), and try whether dividing by the std helps on top of that.
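Roughly like this (per-image stats here, just for illustration):

```python
import numpy as np

img = np.random.randint(0, 256, size=(224, 224, 3)).astype(np.float32)
img = img / 255.0       # in [0, 1], but not centered around 0
img = img - img.mean()  # centered around 0
img = img / img.std()   # optional: try whether unit variance also helps
```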

Another thing to consider is at what level to calculate the statistics. Per image? Per batch? Over the whole training set? All are possible but can give different results.
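Sketched in numpy (assuming a batch of shape (batch, h, w, c)):

```python
import numpy as np

batch = np.random.rand(32, 224, 224, 3)  # hypothetical batch: (batch, h, w, c)

per_image = batch.mean(axis=(1, 2), keepdims=True)  # one mean per image and channel
per_batch = batch.mean(axis=(0, 1, 2))              # channel means over this batch
# Whole-dataset stats: run once over all training batches and
# accumulate a running mean/std before training starts.
```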

Good luck 🙂


Thanks! That clears it up.

A small follow-up question. For an image of dimensions (h, w, c), the way to find the mean would be

np.mean(array, axis=(0, 1))

then do a similar thing for the std dev and plug them into the transforms.
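In full, I mean something like this (made-up image just to show the shapes):

```python
import numpy as np

array = np.random.rand(224, 224, 3)  # hypothetical (h, w, c) image

mean = array.mean(axis=(0, 1))       # per-channel mean, shape (c,)
std = array.std(axis=(0, 1))         # per-channel std, shape (c,)
normalized = (array - mean) / std    # the values I'd plug into the transforms
```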

However, the answer to this question seems to suggest something different. Is it actually different, or am I confused about it?

What that answer is suggesting is a kind of normalization where "R + B + G = 1 for every pixel". I don’t see how this would be useful for a neural network, mainly because it lacks the most important part: centering around 0.
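For reference, that scheme would look something like this, and you can see nothing ends up centered:

```python
import numpy as np

array = np.random.rand(224, 224, 3)    # hypothetical (h, w, c) image
s = array.sum(axis=-1, keepdims=True)  # per-pixel sum of the channels
chromaticity = array / s               # now R + B + G = 1 for every pixel
# All values stay in [0, 1]; nothing is centered around 0.
```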


Yeah, I didn’t get it either. Thanks for the clarification, it helped a lot!