Calculating Standard Deviation of Images (Lesson 7)

(David Pratt) #1

At around 1:06:30 of Lesson 7, Jeremy talks about providing the mean and standard deviation of each RGB channel for the set of images. He has pre-calculated these and recommends that students calculate them for themselves.

Calculating the mean I found easy enough:

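import numpy as np
from PIL import Image  # PIL is assumed here for loading the image files

# Load every training image as an H x W x 3 array
x = [np.array(Image.open(f'{PATH}/train/{fname}')) for fname in files]
mean = np.zeros(3)  # an array, so += adds element-wise instead of extending a list
for image in x:
    mean += np.mean(image, axis=(0, 1))
mean / len(x) / 255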

The standard deviation is where I’m struggling. I’m currently using this:

sd = []
for image in x:
    sd.append(np.std(image, axis=(0,1)))
np.array(sd).mean(axis=0) / 255

That gives me values of [0.2022, 0.19932, 0.20086], which are incorrect.

I’m pretty sure my problem is that I don’t fully understand how the standard deviation should be calculated across a whole set of images. I have experimented with ddof=1, though that seems to make little difference.

Any advice or resources that could lead me in the right direction would be greatly appreciated. Thanks!


@fortydegrees You should first stack all the pixel values for each channel together across every image (append them in a for loop), and then call np.std once on the stacked values.
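
A minimal sketch of that idea, in case it helps. The helper name `channel_stats` and the random stand-in images are made up for illustration; in practice `images` would be the arrays loaded from the training folder. Averaging the per-image standard deviations ignores how much the image means differ from one another, so it generally comes out smaller than the dataset-wide value.

```python
import numpy as np

def channel_stats(images):
    """Per-channel mean and std over every pixel of every image.

    `images` is an iterable of H x W x 3 uint8 arrays (sizes may differ).
    Hypothetical helper, not part of any library.
    """
    # Flatten each image to (num_pixels, 3), then stack them into one long array.
    pixels = np.concatenate([img.reshape(-1, 3) for img in images], axis=0)
    # Scale to [0, 1] before computing the statistics.
    pixels = pixels.astype(np.float64) / 255.0
    return pixels.mean(axis=0), pixels.std(axis=0)

# Synthetic stand-in data: three small random "images" of different sizes.
rng = np.random.default_rng(0)
sizes = [(4, 5), (6, 3), (2, 2)]
images = [rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8) for h, w in sizes]

mean, std = channel_stats(images)

# The approach from the question, for comparison: average of per-image stds.
per_image_std = np.mean([img.std(axis=(0, 1)) for img in images], axis=0) / 255
```

For a large dataset, holding every pixel in memory at once may not be practical; accumulating per-channel sums and sums of squares over the images should give the same result with far less memory.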