Overflow with exp in Kaggle when increasing sizes

Hi friends,
I’m running different experiments with a net I built from scratch in Python, using Kaggle notebooks. Everything works fine, but when the size of the input data and/or the number of hidden units grows beyond a certain point I start getting these warnings:

/opt/conda/lib/python3.6/site-packages/ipykernel_launcher.py:13: RuntimeWarning: overflow encountered in exp
del sys.path[0]

The only part of my code where I use exp is in the sigmoid:
def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

It all works and trains fine until I reach a certain size of data/units, and then the warning starts appearing.

I am standardizing the data using sklearn:
from sklearn import preprocessing
import pandas as pd

names = df.columns
scaler = preprocessing.StandardScaler()
scaled_df = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled_df, columns=names)

and initializing the weights with:
np.random.randn(layer_size[l], layer_size[l-1]) * np.sqrt(2 / layer_size[l-1])

and the biases to 0.

Any tips or advice on how to deal with this? Thanks very much.

Your sigmoid function is not numerically stable.
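
The warning comes from np.exp itself: in float64, exp overflows once its argument goes past roughly 709, which is exactly what happens when a pre-activation is strongly negative and -x becomes large. A quick check (just to illustrate, not part of your net):

import numpy as np

x = np.float64(-710.0)            # a strongly negative pre-activation
np.exp(-x)                        # exp(710) -> inf, with "overflow encountered in exp"
1.0 / (1 + np.exp(-x))            # the sigmoid still comes out as 0.0, but the warning fires

The result is usually still usable (1/(1+inf) is 0), which is why training keeps working, but it's better to avoid the overflow altogether.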

You should use something like:

from math import exp

def sigmoid(x):
    "Numerically stable sigmoid function."
    if x >= 0:
        z = exp(-x)
        return 1 / (1 + z)
    else:
        # if x is less than zero then z will be small; the denominator
        # can't be zero because it's 1 + z.
        z = exp(x)
        return z / (1 + z)

This is from this blog post: http://timvieira.github.io/blog/post/2014/02/11/exp-normalize-trick/. The reasons to use it are explained there, but in short, you just need to express the sigmoid in a different (but mathematically identical) way, so that exp never receives a large positive argument and can't overflow.
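
One thing to watch out for: the version above takes a scalar x, so the if won't work elementwise on the NumPy arrays your layers pass around. A vectorized sketch of the same trick (assuming the pre-activations arrive as a float NumPy array) could look like:

import numpy as np

def sigmoid(x):
    "Numerically stable sigmoid, vectorized for NumPy arrays."
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    # where x >= 0, exp(-x) is <= 1 and can't overflow
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # where x < 0, exp(x) is <= 1; the denominator 1 + z can't be zero
    z = np.exp(x[~pos])
    out[~pos] = z / (1.0 + z)
    return out

Either way it's mathematically the same sigmoid, just written so exp never sees a large positive argument.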
