ReLU and its effectiveness

balnazzar · January 21, 2018, 6:04pm

We all know the advantages and disadvantages of ReLU with respect to other popular nonlinearities like sigmoid, tanh, etc.

What I struggle to understand is its effectiveness in allowing an MLP to approximate nonlinear functions and separate nonlinear regions.

ReLU is, in the end, the most trivial linear function (the identity x, which leaves its input untouched) glued to the constant function 0.
Restricting ourselves to a single neuron, it just leaves the result of a dot product as it is, or suppresses it altogether if it’s negative.

How can ReLU be a useful nonlinearity? After all, we know that a NN with just linear activations (even the most general mx+q, with m and q varying for each layer or even each neuron) would not be capable of separating nonlinear regions (any composition of such affine mappings, no matter how long, is just another affine mapping).
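
To make the collapse of stacked linear activations concrete, here is a minimal sketch for the scalar case. The helper names and the (m, q) values are made up for illustration; the only point is that three layers of the form mx+q behave exactly like a single one.

```python
# Sketch: stacking layers x -> m*x + q collapses into one map x -> M*x + Q.
# The (m, q) pairs below are arbitrary illustrative numbers.

def compose_affine(layers):
    """Return (M, Q) such that applying all layers in order equals M*x + Q."""
    M, Q = 1.0, 0.0
    for m, q in layers:
        # m*(M*x + Q) + q = (m*M)*x + (m*Q + q)
        M, Q = m * M, m * Q + q
    return M, Q

def apply_layers(layers, x):
    """Feed x through every layer in turn, with no nonlinearity in between."""
    for m, q in layers:
        x = m * x + q
    return x

layers = [(2.0, 1.0), (-0.5, 3.0), (4.0, -2.0)]
M, Q = compose_affine(layers)

# The deep "network" and the single collapsed layer agree on every input.
for x in (-3.0, 0.0, 1.5):
    assert abs(apply_layers(layers, x) - (M * x + Q)) < 1e-12
```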

Thanks.

raspstephan · January 22, 2018, 7:19am

http://neuralnetworksanddeeplearning.com/chap4.html

Maybe this will help you understand it. He uses a step function as the nonlinearity, but ReLU would work equally well.
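
As a rough sketch of why ReLU can play the same role as a step function (this is my own toy construction, not code from the linked chapter): three ReLU units are enough to build a localized triangular bump, and bumps like this can then be scaled and summed to trace out a target function. The function name `relu_bump` and its parameters are hypothetical.

```python
def relu(x):
    return max(0.0, x)

def relu_bump(x, left, width):
    """Triangular bump on [left, left + 2*width] with peak height 1.

    Built from three ReLU units with kinks at a, b, c. The slopes +1, -2, +1
    cancel to zero past c, so the bump vanishes outside its support.
    """
    a, b, c = left, left + width, left + 2 * width
    return (relu(x - a) - 2 * relu(x - b) + relu(x - c)) / width

# The bump rises from 0 at x=0 to 1 at x=0.5 and falls back to 0 at x=1.
for x in (-1.0, 0.0, 0.25, 0.5, 0.75, 1.0, 2.0):
    print(x, relu_bump(x, left=0.0, width=0.5))
```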

marcemile · January 22, 2018, 11:10am

You should play with

It will give you a good intuition of how the different activations work.

balnazzar · January 22, 2018, 3:17pm

Useful links. But I would have hoped for something a bit more theoretically grounded…

SakvaUA · January 22, 2018, 3:24pm

Here you go:

A good theoretical paper that shows that neural networks are piecewise linear and because of this are susceptible to adversarial examples.

SakvaUA · January 22, 2018, 3:35pm

And BTW, the regions where sigmoid and tanh are highly nonlinear are associated with exploding/vanishing gradients. So they are nonlinear, but not quite.

balnazzar · January 22, 2018, 7:24pm

Thanks, I’m sure I’ll enjoy it.

OmarAmin · January 23, 2018, 3:41pm

https://fleuret.org/dlc/

Check lecture number 3 on MLPs, slide #7. You’ll find a visualization of how ReLU is able to approximate a nonlinearity.
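
For anyone who wants to try the idea numerically, here is a small sketch (not taken from the slides, and the function names are my own): a single hidden layer of ReLU units, each with a kink at a different point, gives a piecewise linear curve that interpolates the target, and the fit tightens as units are added. The target x^2 on [0, 1] is just an arbitrary example.

```python
def relu(x):
    return max(0.0, x)

def fit_relu_sum(f, lo, hi, n_units):
    """Pick bias, knots and weights so that
    bias + sum_i w_i * relu(x - k_i) interpolates f at equally spaced knots."""
    step = (hi - lo) / n_units
    knots = [lo + i * step for i in range(n_units)]
    # Slope of the target on each segment between consecutive knots.
    slopes = [(f(k + step) - f(k)) / step for k in knots]
    # Each unit contributes the change in slope that happens at its knot.
    weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n_units)]
    return f(lo), knots, weights

def relu_sum(x, bias, knots, weights):
    return bias + sum(w * relu(x - k) for k, w in zip(knots, weights))

def max_error(f, lo, hi, n_units, samples=1000):
    """Largest gap between f and its ReLU approximation on a sample grid."""
    params = fit_relu_sum(f, lo, hi, n_units)
    xs = [lo + (hi - lo) * i / samples for i in range(samples + 1)]
    return max(abs(f(x) - relu_sum(x, *params)) for x in xs)

def target(x):
    return x * x

# The error shrinks as the number of ReLU units grows.
for n in (2, 4, 8, 16):
    print(n, max_error(target, 0.0, 1.0, n))
```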

hope it helps.

thanks

balnazzar · January 23, 2018, 4:21pm

Didn’t know that course. Thanks, I think it will be interesting for other stuff too…

EDIT: OK, found it in handout 3B. It is like I imagined, but it provides due justification.

Thanks! That answers my question!