In a ReLU network each neuron can only contribute one bend (one change of slope) to the network's piecewise-linear response curve.
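This is easy to check in one dimension: a single ReLU neuron f(x) = relu(w·x + b) is piecewise linear with exactly one kink, at x = −b/w. A minimal numerical sketch (the particular w, b values are just for illustration):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

# One ReLU neuron on a scalar input: f(x) = relu(w*x + b).
# The output is piecewise linear with exactly one kink, at x = -b/w.
w, b = 2.0, -1.0
xs = np.linspace(-2.0, 2.0, 401)
ys = relu(w * xs + b)

kink = -b / w  # location of the single bend; here 0.5
# Slope is 0 to the left of the kink and w to the right -- one bend, no more.
left_slope = (ys[1] - ys[0]) / (xs[1] - xs[0])
right_slope = (ys[-1] - ys[-2]) / (xs[-1] - xs[-2])
```

However many weights feed into the neuron, the activation still only buys one slope change.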

If you have a conventional fully connected ReLU network with 1 million neurons in a layer, *each* neuron has 1 million weight parameters linking back to the prior layer.

That is 1 million parameters to buy one bend in a curve, which is intuitively a bit excessive. What would a good number be then? 1,000 parameters per bend? 100? Or is the optimum actually at the other extreme, 1 or 2 parameters per bend, if you can arrange such a thing?
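A quick back-of-the-envelope count, assuming the prior layer has the same width (the helper names here are hypothetical, just for bookkeeping):

```python
# Parameters spent per ReLU "bend" in a square fully connected layer,
# versus a hypothetical scheme with only a couple of parameters per neuron.

def dense_params_per_bend(n):
    # n inputs -> n neurons: n*n weights + n biases, and n ReLU bends.
    return (n * n + n) / n  # = n + 1

def sparse_params_per_bend(params_per_neuron=2):
    # Hypothetical alternative: a fixed (unlearned) mixing transform,
    # with each neuron keeping only ~2 parameters of its own.
    return params_per_neuron

print(dense_params_per_bend(1_000_000))  # about a million parameters per bend
print(sparse_params_per_bend())          # two parameters per bend
```

The gap between the two regimes is roughly a factor of the layer width itself.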

Example:

https://s6regen.github.io/Fast-Transform-Neural-Network-Evolution/

Backpropagation version:

https://s6regen.github.io/Fast-Transform-Neural-Network-Backpropagation/
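The linked demos don't reproduce the layer internals here, but one way such a fast-transform layer could work (an assumption on my part, not necessarily exactly what the demos implement) is to replace the learned dense weight matrix with a fixed fast Walsh-Hadamard transform and give each output element its own two-parameter switched-slope activation, i.e. roughly two parameters per bend:

```python
import numpy as np

def fwht(x):
    # In-place fast Walsh-Hadamard transform, O(n log n) adds/subtracts.
    # n must be a power of 2; no learned weights involved.
    n = len(x)
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x

def fast_transform_layer(x, slopes_pos, slopes_neg):
    # Fixed transform mixes every input into every output, then each
    # element gets a 2-parameter switched-slope activation:
    # slope slopes_pos[i] for y >= 0, slopes_neg[i] for y < 0.
    # One bend per element, two learned parameters per bend.
    y = fwht(np.array(x, dtype=float))
    return np.where(y >= 0.0, slopes_pos * y, slopes_neg * y)

out = fast_transform_layer([1.0, 2.0, 3.0, 4.0],
                           np.ones(4), np.zeros(4))  # ReLU-like choice
```

With slopes_pos all 1 and slopes_neg all 0 the activation reduces to plain ReLU, but evolution or backpropagation is free to adjust both slopes per element.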


I think I get something of what you are saying. But if the prior layer has **N** neurons, won’t there be **N** (not 1 million) weight parameters connecting each neuron to the previous layer?

Yes, sure. For simplicity I assumed uniformity of layer design.

I used 1 million as an extreme example to highlight the issue.

Maybe Ankit Patel is starting to inch toward some understanding in this video:

https://youtu.be/QEWe-aRBUAs