Batch Normalisation vs Dropout

According to the batch normalization paper:

When we use batch normalization, Dropout is not needed and should not be used if we want the maximum benefit from batch norm.

From the paper (section 4.2.1, page 6): "Batch Normalization fulfills some of the same goals as Dropout. Removing Dropout from Modified BN-Inception speeds up training, without increasing overfitting."

On the other hand, lesson 4 (video) suggests that adding Dropout in addition to batch norm does yield an improvement.


Batch normalization helps reduce overfitting because it is noisy: the output for a given sample depends on which other data points happen to be batched with it. Dropout has a similar regularizing effect. Depending on your level of overfitting, you might need neither, one, or both.
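To see why batch norm is "noisy", here is a minimal numpy sketch (not the actual framework implementation) of both operations. It shows that the same sample gets different batch-normalized values depending on its batch-mates, and that dropout injects multiplicative noise at training time only:

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(x, eps=1e-5):
    # Normalize each feature using the statistics of the current batch,
    # so a sample's output depends on the other samples in the batch.
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)

def dropout(x, p=0.5, training=True):
    # Inverted dropout: zero each unit with probability p during
    # training and rescale the survivors; identity at inference.
    if not training:
        return x
    mask = (rng.random(x.shape) >= p) / (1 - p)
    return x * mask

sample = np.ones((1, 4))
batch_a = np.vstack([sample, rng.normal(size=(7, 4))])
batch_b = np.vstack([sample, rng.normal(size=(7, 4))])

# The first row is identical in both batches, yet its normalized
# value differs because the batch statistics differ.
out_a = batch_norm(batch_a)[0]
out_b = batch_norm(batch_b)[0]
print(np.allclose(out_a, out_b))  # False: batch norm is batch-dependent noise
```

Both layers therefore inject noise during training, which is why their regularizing effects overlap and why the paper could drop Dropout once batch norm was added.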