Trying to understand the concept of forceful weight adjustments

So we’ve carefully designed neural networks and spent time training them to “acquire knowledge” to solve problems.
This “knowledge” is essentially the weights that are tuned during training.
The weights are tuned by mathematical optimization algorithms.

However, the concept of dropout, which randomly “kills” activations, is what I’m confused about.
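To make the question concrete, here is a minimal sketch of dropout as it is commonly described (the "inverted" variant, where surviving activations are rescaled). The function name `dropout`, the parameters `p`, `training`, and `rng`, and the sample values are all illustrative assumptions, not any particular library's API.

```python
import random

def dropout(activations, p=0.5, training=True, rng=None):
    """Inverted dropout (illustrative sketch): zero each activation with
    probability p and scale the survivors by 1 / (1 - p), so the expected
    value of each activation stays the same."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # Outside of training nothing is dropped.
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - p
    # Keep each activation with probability `keep`, otherwise zero it.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

acts = [0.2, 1.5, 0.7, 3.0]                              # hypothetical activations
train_out = dropout(acts, p=0.5, rng=random.Random(0))   # some entries zeroed
eval_out = dropout(acts, training=False)                 # unchanged copy
```

In this sketch the random "killing" only happens when `training=True`; with `training=False` the activations pass through untouched.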

While I do understand that dropout is used to reduce overfitting, it sounds like a brute-force way to “fix” a design flaw in the neural network.

It’s sort of like: “this complicated and fancy car is doing too much for what I need it for, so let me bang it with a hammer and downgrade its quality to a bicycle”.
Wouldn’t getting a bicycle in the first place be a cheaper and easier solution?

Wouldn’t a better approach be to simplify or optimize the architecture of the neural net, instead of forcefully negating its internal functions by randomly “killing” activations?

No, dropout is much cooler than that! :) Here’s a paper that describes some of the amazing properties:
