Calculating Standard Deviation of Images (Lesson 7)

fortydegrees · May 9, 2018, 12:23pm

So at around 1:06:30 of Lesson 7, Jeremy talks about providing the mean and standard deviation for each RGB channel for the set of images. He has pre-calculated these and recommends that students calculate them for themselves.

Calculating the mean I found easy enough:

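import numpy as np
from PIL import Image

x = np.array([np.array(Image.open(f'{PATH}/train/' + fname)) for fname in files])
mean = np.zeros(3)  # an array, so += adds element-wise (a list would be extended instead)
for image in x:
    mean += np.mean(image, axis=(0, 1))  # per-channel mean of this image
mean / len(x) / 255  # average over images, scaled to [0, 1]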

However, when calculating the standard deviation, I’m struggling. I’m currently using this:

sd = []
for image in x:
    sd.append(np.std(image, axis=(0,1)))
np.array(sd).mean(axis=0) / 255

However, that gives me values of [0.2022, 0.19932, 0.20086], which are incorrect.

I’m pretty sure my problem is that I don’t properly understand how the standard deviation should be calculated across a set of images. I have experimented with ddof=1, though that seems to make little difference.

Any advice or resources that could lead me in the right direction would be greatly appreciated. Thanks!

draz · May 25, 2018, 10:20pm

@fortydegrees You should first stack all the values per channel together in a for loop (use append), and then calculate the np.std.
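
To illustrate the suggestion, here is a minimal sketch on synthetic data. The helper name `channel_stats`, the array shape (N, H, W, 3), and the random images are assumptions made for the example; they stand in for the real training set. It pools every pixel of every image per channel before taking the standard deviation, and prints the average of per-image standard deviations alongside for comparison.

```python
import numpy as np

def channel_stats(images):
    """Per-channel mean and std over every pixel of every image, scaled to [0, 1].

    `images` is assumed to be a uint8 array of shape (N, H, W, 3).
    """
    # Flatten to (N*H*W, 3): one row per pixel, one column per channel.
    pixels = images.reshape(-1, images.shape[-1]).astype(np.float64)
    return pixels.mean(axis=0) / 255, pixels.std(axis=0) / 255

# Synthetic stand-in for the training images (hypothetical data).
rng = np.random.default_rng(0)
brightness = rng.integers(40, 216, size=(100, 1, 1, 1))  # per-image base level
noise = rng.integers(-40, 40, size=(100, 32, 32, 3))     # per-pixel variation
images = (brightness + noise).astype(np.uint8)

mean, std = channel_stats(images)

# The approach from the question: std per image, then averaged.
avg_of_stds = images.std(axis=(1, 2)).mean(axis=0) / 255

print("pooled std:        ", std)
print("mean of image stds:", avg_of_stds)
```

Because the synthetic images differ in overall brightness, the pooled value comes out larger than the averaged one, which is the kind of gap described in the question.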

Ghiya6548 · October 29, 2018, 4:44pm

Hi @draz,

I am still not able to figure out the standard deviation calculation.
Here is what I have done:
PATH = "cifar/"

files = os.listdir(f'{PATH}train/')

x = np.array([np.array(Image.open(f'{PATH}train/' + fname)) for fname in files])

for image in x:
    c = 0
    r = image[:,c,c]
    r_std = np.std(r)
    c+=1
    g = image[:,c,c]
    g_std = np.std(g)
    c+=1
    b = image[:,c,c]
    b_std = np.std(b)
    std +=(np.array([r_std,g_std,b_std]))

I have calculated the standard deviation for each channel of each image and summed them up. Does this make sense, or am I doing something wrong?

draz · October 29, 2018, 4:52pm

You should first just stack all the values per channel (not the STDs per channel) and then calculate the STD over the stacked values per channel.
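
A short identity may help show why the two approaches differ. As a sketch, assume N images with the same number of pixels and the population standard deviation (ddof=0). For one channel, write μ_i and σ_i for the mean and STD of image i, and μ and σ for the mean and STD of the stacked values:

```latex
\sigma^2 \;=\; \frac{1}{N}\sum_{i=1}^{N}\sigma_i^2 \;+\; \frac{1}{N}\sum_{i=1}^{N}\left(\mu_i - \mu\right)^2,
\qquad
\mu \;=\; \frac{1}{N}\sum_{i=1}^{N}\mu_i
```

The second term measures how much the images differ from each other in average brightness. Averaging the per-image STDs drops that term entirely, and it also averages square roots rather than variances, so it underestimates the STD of the stacked values whenever the images are not all alike.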

wyquek · October 30, 2018, 5:41am

The batchnorm code in Lesson 7 dropped good hints on how to do this. Alternatively, maybe try searching the forums for ‘standard deviation’.

Ghiya6548 · October 30, 2018, 6:28am

Thanks a lot for the prompt response. This works. Such a silly mistake on my part; I should have thought of this.

import os
import numpy as np
from PIL import Image

std = []

PATH = "cifar/"

files = os.listdir(f'{PATH}train/')

x = np.array([np.array(Image.open(f'{PATH}train/' + fname)) for fname in files])

# Collect the raw pixel values of each channel for every image.
for image in x:
    c = 0
    r_std = image[:,:,c].flatten()
    c+=1
    g_std = image[:,:,c].flatten()
    c+=1
    b_std = image[:,:,c].flatten()
    std.append(np.array([r_std,g_std,b_std]))

np_std = np.array(std)

# Stack the values of each channel across all images.
r_channel = np_std[:,0].flatten()
g_channel = np_std[:,1].flatten()
b_channel = np_std[:,2].flatten()

# Standard deviation over the stacked values, scaled to [0, 1].
r_norm = np.std(r_channel) / 255
g_norm = np.std(g_channel) / 255
b_norm = np.std(b_channel) / 255
imgs_std = np.array([r_norm,g_norm,b_norm])
print(imgs_std)
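
If holding every pixel in memory at once is a concern, the same result can be reached with running sums, using Var(x) = E[x²] − E[x]². This is a different technique from the stacking above, sketched here under assumptions: the function name `streaming_channel_std` and the synthetic `fake_images` array are made up for the example, and each image is assumed to be an (H, W, 3) array with values in 0 to 255.

```python
import numpy as np

def streaming_channel_std(image_iter):
    """Per-channel std in [0, 1] from running sums, one image at a time."""
    count = 0
    total = np.zeros(3, dtype=np.float64)
    total_sq = np.zeros(3, dtype=np.float64)
    for image in image_iter:
        # One row per pixel, one column per channel.
        pixels = np.asarray(image, dtype=np.float64).reshape(-1, 3)
        count += pixels.shape[0]
        total += pixels.sum(axis=0)
        total_sq += (pixels ** 2).sum(axis=0)
    mean = total / count
    variance = total_sq / count - mean ** 2  # E[x^2] - E[x]^2
    return np.sqrt(variance) / 255

# Synthetic stand-in for the training images (hypothetical data).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(50, 32, 32, 3)).astype(np.uint8)

result = streaming_channel_std(fake_images)
print(result)
```

The accumulators are kept in float64 so that the sums of squares do not overflow the uint8 pixel type.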

Ghiya6548 · October 30, 2018, 6:29am

Thanks. I am actually on that lecture as of now.