Error in calculating mean and standard deviation of channels of image

n00b_1o1 · October 20, 2018, 2:41pm

I am trying to find the per-channel mean and standard deviation of RGB images. I loaded the images into a matrix with dimensions [50000, 32, 32, 3] and then ran:
Matrix.mean(axis=(0,1,2))
Matrix.std(axis=(0,1,2))

The answers I got were
array([0.44653, 0.48216, 0.4914 ]) for mean
array([0.26159, 0.24349, 0.24703]) for std

but the correct answers are
np.array([ 0.4914 , 0.48216, 0.44653]) for mean
np.array([ 0.24703, 0.24349, 0.26159]) for std

Somehow my first and last values are getting exchanged. I cannot understand why this is happening. Can someone please help me?

P.S. Matrix is the variable name for the image matrix.
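
The post does not say how Matrix was loaded, so the following is only a guess: a first/last swap is what you would see if the loader returned channels in BGR order (OpenCV's cv2.imread does this) rather than RGB. A minimal sketch on synthetic data, assuming that is the cause; the array names are illustrative and not from the original post.

```python
import numpy as np

# Synthetic stand-in for the image matrix: 8 images, 32x32 pixels, 3 channels.
rng = np.random.default_rng(0)
rgb = rng.random((8, 32, 32, 3))
rgb[..., 0] *= 0.5   # make the three channels statistically distinct
rgb[..., 2] += 1.0

# Assumption: the loader handed back BGR, i.e. the channel axis is reversed.
bgr = rgb[..., ::-1]

mean_bgr = bgr.mean(axis=(0, 1, 2))
std_bgr = bgr.std(axis=(0, 1, 2))

# Reversing the per-channel statistics recovers the RGB ordering.
mean_rgb = mean_bgr[::-1]
std_rgb = std_bgr[::-1]
```

If this is what happened, either reverse the statistics as above or reverse the channel axis once (Matrix[..., ::-1]) before computing them.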

Kobe430am · October 20, 2018, 6:05pm

Perhaps try this. It is not the fastest way, but it works:

from pathlib import Path
import matplotlib.pyplot as plt
import numpy as np

TRAIN = Path("data/cifar10/train_original") # image directory
images = [plt.imread(str(i)) for i in TRAIN.iterdir()] # a list, since np.stack needs a sequence
images = np.stack(images) # images.shape = (50000, 32, 32, 3)
images = np.reshape(images.T, (3,-1)) # images.shape = (3, 51200000)
np.mean(images, axis=1)
np.std(images, axis=1)

Kaspar · October 20, 2018, 9:00pm

rMean, gMean, bMean, rSD, gSD, bSD
0.49186775, 0.48265323, 0.4471764, 0.20215076, 0.19926804, 0.2008907

Per image I calculate:
def channelStat(im):  # per-channel mean and std of one HxWx3 image
    return np.mean(im[:,:,0]), np.mean(im[:,:,1]), np.mean(im[:,:,2]), np.std(im[:,:,0]), np.std(im[:,:,1]), np.std(im[:,:,2])

With this array of 6 values for every image, I take the mean of all the rMean, gMean, etc. values.

The mean: we should get identical mean values with all methods.
The std: your std should be larger than mine, because the std of each channel across all images will be larger than the mean of the per-image std for that channel. Looking at Jeremy's values, I believe he took the std per channel across all images.
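
The reasoning above can be checked on synthetic data. This is a sketch, not anyone's actual pipeline: the random images and variable names are made up for illustration. It compares the pooled statistics (across all images) with the averaged per-image statistics.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic images whose overall brightness varies from image to image.
offsets = rng.random((100, 1, 1, 1))
images = 0.5 * offsets + 0.5 * rng.random((100, 32, 32, 3))

# Pooled: one mean/std per channel across every pixel of every image.
pooled_mean = images.mean(axis=(0, 1, 2))
pooled_std = images.std(axis=(0, 1, 2))

# Per image first, then averaged over images.
per_image_mean = images.mean(axis=(1, 2))   # shape (100, 3)
per_image_std = images.std(axis=(1, 2))     # shape (100, 3)
mean_of_means = per_image_mean.mean(axis=0)
mean_of_stds = per_image_std.mean(axis=0)

print(pooled_mean, mean_of_means)   # these agree
print(pooled_std, mean_of_stds)     # pooled std is the larger one
```

The means agree because every image has the same number of pixels. The pooled std also includes the spread of the per-image means, which the per-image std never sees, so it cannot be smaller than the averaged per-image std.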

Kobe430am · October 21, 2018, 1:15am

Oh nice. So I should improve my code to:

TRAIN = Path("data/cifar10/train_original")
images = [plt.imread(str(i)) for i in TRAIN.iterdir()] # a list, since np.stack needs a sequence
images = np.stack(images)  # this takes time
np.mean(images[:,:,:,0]),np.mean(images[:,:,:,1]),np.mean(images[:,:,:,2])
np.std(images[:,:,:,0]),np.std(images[:,:,:,1]),np.std(images[:,:,:,2])

(0.49139923, 0.4821585, 0.44653007)
(0.24703227, 0.24348488, 0.261588)
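
For reference, the per-channel indexing above, the axis=(0,1,2) reduction from the original question, and the transpose-and-reshape approach from the earlier reply should all give the same numbers. A small sketch on synthetic data (the array here is random, not CIFAR-10):

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((10, 32, 32, 3))   # stand-in for the stacked image array

# 1) Reduce over every axis except the channel axis.
mean_axis = images.mean(axis=(0, 1, 2))
std_axis = images.std(axis=(0, 1, 2))

# 2) Index each channel explicitly.
mean_indexed = np.array([images[:, :, :, c].mean() for c in range(3)])
std_indexed = np.array([images[:, :, :, c].std() for c in range(3)])

# 3) Transpose so channels come first, then flatten everything else.
flat = np.reshape(images.T, (3, -1))   # shape (3, 10 * 32 * 32)
mean_flat = flat.mean(axis=1)
std_flat = flat.std(axis=1)
```

All three only reorder or group the same pixels per channel, so the choice is mostly about readability. None of them changes the channel order, so a swap like the one in the original question has to come from how the images were loaded.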

Kaspar · October 21, 2018, 11:23am

Nice, spot on with Jeremy's calculations.