Why does ReLU work at all when its gradient is 0?

If a single unit's gradient becomes 0, the nearly thousand other backpropagated units with nonzero gradients (often 1) are all affected by it, because when we compute dJ/dw = 1·1·0·1·…·1, the chain-rule factors are multiplied together and that one 0 zeroes out the entire product (one useless vote overrules hundreds of useful votes). How does ReLU actually overcome this?

I know Leaky ReLU is a solution to this problem, but I want to know how ReLU still works if this really happens so often.
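To make the concern concrete, here is a minimal sketch (not from the original question; the array values are illustrative) comparing the chain-rule product along a single backprop path under ReLU versus Leaky ReLU. One "dead" unit (negative pre-activation) zeroes the whole product for ReLU, while Leaky ReLU keeps it small but nonzero:

```python
import numpy as np

def relu_grad(z):
    # derivative of ReLU: 1 where z > 0, else 0
    return (z > 0).astype(float)

def leaky_relu_grad(z, alpha=0.01):
    # derivative of Leaky ReLU: 1 where z > 0, else alpha
    return np.where(z > 0, 1.0, alpha)

# hypothetical pre-activations of the units along one backprop path;
# the middle unit is "dead" (negative input)
z = np.array([2.0, -1.0, 3.0])

# along a single path, the chain-rule factors multiply
path_grad_relu = np.prod(relu_grad(z))        # one 0 factor kills the product
path_grad_leaky = np.prod(leaky_relu_grad(z)) # stays nonzero

print(path_grad_relu)   # 0.0
print(path_grad_leaky)  # 0.01
```

Note, though, that dJ/dw in a real network sums over many such paths, so a zero along one path does not necessarily zero the total gradient.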

I think this is a duplicate of this one.