Hi,

I’m working on an image classification problem in **PyTorch**, and I need to normalize the images so that they have a mean of 0.0 and a standard deviation of 1.0 (reference: https://cs231n.github.io/neural-networks-2/#datapre ). I implemented the mean subtraction and std division, but the network’s behaviour is strange and I’m not sure my implementation is correct.

These are my transform settings:

```
transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor()
])
```

which means the output images are in the range [0, 1], since `ToTensor()` scales PIL images to that range (it doesn’t work without the `ToPILImage()` transform). They have a mean of 0.75 and a std of 0.33.

When I calculated the per-channel mean, I got `[0.76487684, 0.75205952, 0.74630833]` (RGB), and likewise for the std: `[0.27936298, 0.27850413, 0.28387958]`. I normalize using `transforms.Normalize(mean, std)`, which computes `(x - mean) / std` per channel.
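For reference, the math `Normalize` performs can be sketched in plain NumPy (the stats are the measured values above; the batch here is just a random stand-in):

```python
import numpy as np

# The per-channel stats measured above
mean = np.array([0.76487684, 0.75205952, 0.74630833])
std = np.array([0.27936298, 0.27850413, 0.28387958])

# A random stand-in batch in NCHW layout, values in [0, 1]
images = np.random.rand(4, 3, 80, 80)

# Subtract the mean and divide by the std per channel, broadcasting
# the (3,) stats over the batch and spatial dimensions
normalized = (images - mean[None, :, None, None]) / std[None, :, None, None]
```

The `[None, :, None, None]` indexing reshapes the stats to `(1, 3, 1, 1)` so they broadcast against the `(N, 3, H, W)` batch.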

My code looks like this (for the sake of simplicity, assume I only have 32 images):

```
import numpy as np
import torch

dset = CustomDataset('images', transform=transform)  # PyTorch Dataset object
dataloader = torch.utils.data.DataLoader(dset, batch_size=32, shuffle=False, num_workers=4)
images, labels = next(iter(dataloader))
# images.shape = (32, 3, 80, 80)
numpy_images = images.numpy()
per_image_mean = np.mean(numpy_images, axis=(2, 3))   # shape (32, 3)
per_image_std = np.std(numpy_images, axis=(2, 3))     # shape (32, 3)
pop_channel_mean = np.mean(per_image_mean, axis=0)    # shape (3,)
pop_channel_std = np.mean(per_image_std, axis=0)      # shape (3,)
```
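Note that averaging the per-image stds is only an approximation of the population std: by the law of total variance, the pooled variance also includes the spread of the per-image means. A small demonstration on a random stand-in for `numpy_images`:

```python
import numpy as np

# Random stand-in batch for images.numpy()
rng = np.random.default_rng(0)
batch = rng.random((32, 3, 80, 80))

pooled_var = batch.var(axis=(0, 2, 3))         # (3,) true population variance
within = batch.var(axis=(2, 3)).mean(axis=0)   # mean of per-image variances
between = batch.mean(axis=(2, 3)).var(axis=0)  # variance of per-image means

# Exact identity when all images have the same pixel count:
# pooled_var == within + between
```

The mean is unaffected: with equal-sized images, the mean of per-image means equals the pooled mean. The std is not, because averaging per-image stds drops the `between` term.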

Now, is it “wise” to normalize images to have zero mean and a std of 1? Or do you normalize images to [-1, 1]? Lastly, is my implementation correct? I’m not sure about the `std`.

Thanks in advance.

edit:

I figured I should calculate the std this way:

```
pop_channel_std = np.std(numpy_images, axis=(0, 2, 3))  # shape (3,)
```

However, I have too many images to load at once, so I have to accumulate the mean and std over batches before computing the population values.
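One way to do that (a sketch — `loader` here is a fake generator of random batches standing in for the real DataLoader): accumulate a running per-channel sum and sum of squares over the batches, then derive the pooled mean and std at the end.

```python
import numpy as np

# Running per-channel accumulators
channel_sum = np.zeros(3)
channel_sq_sum = np.zeros(3)
n_pixels = 0

# Fake "DataLoader": four random NCHW batches (stand-in for the real loader)
rng = np.random.default_rng(0)
loader = (rng.random((8, 3, 80, 80)) for _ in range(4))

for batch in loader:
    channel_sum += batch.sum(axis=(0, 2, 3))
    channel_sq_sum += (batch ** 2).sum(axis=(0, 2, 3))
    n_pixels += batch.shape[0] * batch.shape[2] * batch.shape[3]

# Pooled stats over the whole dataset: E[x] and sqrt(E[x^2] - E[x]^2)
pop_mean = channel_sum / n_pixels
pop_std = np.sqrt(channel_sq_sum / n_pixels - pop_mean ** 2)
```

The sum-of-squares formula is numerically fine here because the values are bounded in [0, 1]; for data with a large dynamic range, Welford’s online algorithm is the more stable choice.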