Why add 1e-7 to stds in Tabular Normalize?

mdalvi · September 30, 2019, 7:48pm

Hello Friends,

Walking through the definition of

class Normalize(TabularProc):

I found the code below,

self.means[n],self.stds[n] = df[n].mean(),df[n].std()
df[n] = (df[n]-self.means[n]) / (1e-7 + self.stds[n])

Is there a reason why 1e-7 is added to the standard deviations before the division?

TomB · September 30, 2019, 7:57pm

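It's so you don’t divide by zero, which would give an error (or turn all the values into `inf`, i.e. infinity, or `NaN`). A column where every value is the same has a standard deviation of exactly 0. The small constant is commonly called an epsilon, and you’ll see it in most places that divide by a value that could be 0.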
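To make that concrete, here is a minimal sketch in plain Python rather than the fastai/pandas code quoted above. The `normalize` helper and the sample lists are made up for illustration. Plain Python floats raise `ZeroDivisionError` on 0/0, whereas pandas would typically give `NaN` for the same column.

```python
import statistics

EPS = 1e-7  # same constant as in the quoted snippet


def normalize(values, eps=EPS):
    """Standardize values; eps keeps the denominator from being exactly 0."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample std, the same default pandas uses
    return [(v - mean) / (eps + std) for v in values]


constant = [3.0, 3.0, 3.0]  # every value identical, so std == 0
varied = [1.0, 2.0, 3.0]    # std == 1

normalized_constant = normalize(constant)  # 0 / 1e-7, so all zeros
normalized_varied = normalize(varied)      # very close to [-1, 0, 1]

# Without the epsilon, the constant column divides 0 by 0.
try:
    normalize(constant, eps=0.0)
    raised = False
except ZeroDivisionError:
    raised = True
```

With the epsilon, a constant column simply becomes all zeros, and a column with real variance is shifted by a negligible amount (about 1e-7 relative to a std of 1).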

mdalvi · September 30, 2019, 8:01pm

@TomB

Gotcha! Thanks…