Unbalanced classes and overfitting?

In lesson 2, at around 1:40 or a little before, Jeremy responds to a student question about unbalanced classes by suggesting that duplicating the instances of the rare classes can be a good strategy.

My question is: why doesn’t that tend to promote overfitting? Intuitively, if you only have a handful of instances of a rare class, and instead of getting more instances you just repeat the instances you have, then any idiosyncrasies in the original dataset will be magnified. Right?


Since most training is done in mini-batches, up-sampling gives each mini-batch a chance of containing at least one instance of the rare class. If the rare examples have idiosyncrasies, then yes, up-sampling magnifies their influence, but sometimes that is exactly what you want. I don't think it directly promotes either overfitting or underfitting.
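To make the mini-batch point concrete, here is a rough sketch in plain NumPy. It is not the course's code: the helper names `oversample_indices` and `fraction_of_batches_with_class`, the 990/10 class split, and the batch size of 32 are all assumptions chosen for illustration. It duplicates rare-class rows until the classes are the same size, then measures what fraction of mini-batches contain the rare class before and after.

```python
import numpy as np

def oversample_indices(labels, rng):
    """Hypothetical helper: indices that repeat rare-class rows until
    every class is as large as the most common class."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    parts = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(labels == cls)
        # Keep every original row, then draw the duplicates with replacement.
        extra = rng.choice(idx, size=target - count, replace=True)
        parts.append(np.concatenate([idx, extra]))
    out = np.concatenate(parts)
    rng.shuffle(out)
    return out

def fraction_of_batches_with_class(labels, cls, batch_size):
    """Hypothetical helper: share of full mini-batches that contain
    at least one example of `cls`."""
    labels = np.asarray(labels)
    n_batches = len(labels) // batch_size
    batches = labels[: n_batches * batch_size].reshape(n_batches, batch_size)
    return float((batches == cls).any(axis=1).mean())

rng = np.random.default_rng(0)

# Assumed toy dataset: 990 common examples (class 0), 10 rare ones (class 1).
labels = np.array([0] * 990 + [1] * 10)
rng.shuffle(labels)

balanced = labels[oversample_indices(labels, rng)]

before = fraction_of_batches_with_class(labels, cls=1, batch_size=32)
after = fraction_of_batches_with_class(balanced, cls=1, batch_size=32)
print(f"batches containing the rare class: before={before:.2f}, after={after:.2f}")
```

With only 10 rare examples, at most 10 of the original mini-batches can contain one, so most gradient steps never see the rare class at all. Note that the duplicates are exact copies, so whatever quirks those 10 examples have are repeated along with them, which is the magnification the question is asking about.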

