Hi All

I am trying to understand the following points about the Running BatchNorm implementation.

- In his explanation, Jeremy mentioned that we use the E[x^2] - (E[x])^2 formula because the variance is not constant and the batch size can vary too.

I am not able to understand why the batch size (bs) would vary, except maybe for the last batch or so. I have made the confusing part bold.

Below is the transcript from where he was explaining detail 1:

"(138:14 to 138:45) we take the running average of variance, but you can’t take the running average of areas, it doesn’t make sense to take **the running average of variance, it’s a variance, you know, you can’t just average a bunch of variances, in particular because they might even be different batch sizes, right, because batch size isn’t necessarily constant, right, instead** as we learnt earlier in the class the way that we want to calculate variance is like this, sum of two values of a mean of x squared minus mean of X"
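To make the quoted point concrete for myself, here is a minimal sketch with made-up numbers. The two batches and the helper `variance` are hypothetical and not taken from the lesson code. As far as I can tell, it shows that averaging per-batch variances does not recover the variance of all the data, whereas keeping sums, sums of squares and counts does:

```python
# Two hypothetical mini-batches of different sizes (made-up numbers).
batch_a = [1.0, 2.0, 3.0, 4.0]
batch_b = [10.0, 20.0]

def variance(xs):
    # Population variance via E[x^2] - (E[x])^2.
    n = len(xs)
    mean = sum(xs) / n
    mean_sq = sum(x * x for x in xs) / n
    return mean_sq - mean * mean

# Naive approach: average the two per-batch variances.
avg_of_vars = (variance(batch_a) + variance(batch_b)) / 2

# Alternative: accumulate sums, sums of squares and counts,
# and only turn them into a variance at the end.
total_sum = sum(batch_a) + sum(batch_b)
total_sqr = sum(x * x for x in batch_a + batch_b)
total_count = len(batch_a) + len(batch_b)
pooled_var = total_sqr / total_count - (total_sum / total_count) ** 2

# Reference value: variance of all six numbers taken together.
true_var = variance(batch_a + batch_b)

print(avg_of_vars, pooled_var, true_var)
```

Here `pooled_var` agrees with `true_var`, while `avg_of_vars` is far off, because the naive average ignores both the different batch means and the different batch sizes.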

- Similarly, in detail 2 he talked about keeping track of the batch size. I am unable to understand the bold line in the transcript below, because x.numel() should always return a size that matches the bs we pass.

"(139:44 to 139:59) that we have to be careful of, detail **number two is that the batch size could vary from mini-batch to mini-batch** so we should also register a buffer for count and take an exponentially weighted moving average of the counts of the batch sizes"
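Here is a small sketch of how I currently picture this, under my own assumptions: a hypothetical dataset of 10 items with bs=4, a momentum of 0.1, and every element equal to 5.0 so the correct mean is obvious. The helper `batch_sizes` is made up for illustration. It keeps an exponentially weighted moving average of both the sums and the counts, and divides one by the other:

```python
def batch_sizes(n_items, bs):
    # Sizes of consecutive mini-batches when the last partial batch is kept.
    return [min(bs, n_items - start) for start in range(0, n_items, bs)]

sizes = batch_sizes(10, 4)  # [4, 4, 2]: the last batch is smaller

mom = 0.1  # hypothetical momentum, the weight given to the newest batch
running_sum, running_count = 0.0, 0.0

for size in sizes:
    batch = [5.0] * size  # every element is 5.0, so the true mean is 5.0
    # Exponentially weighted moving averages of the batch sum and batch count.
    running_sum += mom * (sum(batch) - running_sum)
    running_count += mom * (size - running_count)

# Both averages use the same weights, so their ratio is still the mean.
running_mean = running_sum / running_count
print(sizes, running_mean)
```

Because the sum and the count are averaged with the same weights, the smaller last batch does not distort the ratio. If I divided the running sum by a fixed bs instead, the smaller batch would pull the estimate down.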

- In the piece of code below, what is the purpose of new_tensor?

self.count.new_tensor(x.numel()/nc)

What do .new_tensor and numel do here, and what is their purpose?
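For reference, this is a minimal sketch of what I think the two calls do in isolation. The tensor shape and the standalone `count` tensor are my own assumptions, standing in for the activations and the registered buffer in the lesson code:

```python
import torch

# Hypothetical activations: bs=8, nc=3 channels, 4x4 feature maps.
x = torch.randn(8, 3, 4, 4)
nc = x.shape[1]

# Stand-in for a buffer such as self.count (assumed to be a float tensor).
count = torch.tensor(0.)

# numel() is the total number of elements in x: 8 * 3 * 4 * 4.
n_total = x.numel()

# Elements per channel: bs * height * width, a plain Python float.
per_channel = x.numel() / nc

# new_tensor wraps that number in a tensor that, by default,
# has the same dtype and device as `count`.
c = count.new_tensor(per_channel)

print(n_total, per_channel, c, c.dtype, c.device)
```

If this reading is right, the value is not just bs but bs times the spatial size of the feature map, and `new_tensor` is only there so the result lives on the same device and has the same dtype as the buffer.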

I am stuck on this lesson because I am unable to understand these things…