A small thought about the meaning of "overfit first"

On one of the first slides of lesson 8, Jeremy shows the two steps for model training:

  1. Overfit.
  2. Reduce overfitting.

I completely agree with Jeremy’s definition of overfitting: “the condition when the validation loss starts to increase”. However, this condition does not necessarily imply that our model has captured the complexity of the underlying function.

Even a very simple model can overfit in that sense. In the context of model training, I think we aim both for overfitting (as defined above) and for strongly reducing the training loss, so that we are sure the model can at least reproduce the training targets. Only then can we start step 2, namely reducing the overfitting.
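To make the first point concrete, here is a minimal sketch (plain NumPy, with made-up synthetic data) of a very simple model overfitting in the validation-loss sense: a linear model trained with gradient descent on a few noisy samples of sin(x) sees its validation loss bottom out and then climb, even though it obviously hasn't captured the sine.

```python
import numpy as np

# Tiny, deliberately noisy training set from a small slice of the domain
x_tr = np.array([0.0, 0.5, 1.0, 1.5])
y_tr = np.sin(x_tr) + np.array([-0.3, -0.1, 0.1, 0.3])  # noise tilts the slope

# Validation set: the true function over a full period
x_val = np.linspace(0.0, 2 * np.pi, 20)
y_val = np.sin(x_val)

w, b = 0.0, 0.0          # linear model: pred = w * x + b
lr, epochs = 0.1, 500
tr_losses, val_losses = [], []

for _ in range(epochs):
    pred_tr = w * x_tr + b
    tr_losses.append(np.mean((pred_tr - y_tr) ** 2))
    val_losses.append(np.mean((w * x_val + b - y_val) ** 2))
    err = pred_tr - y_tr
    # gradient descent step on the training MSE
    w -= lr * 2 * np.mean(err * x_tr)
    b -= lr * 2 * np.mean(err)

print(f"train loss: {tr_losses[0]:.3f} -> {tr_losses[-1]:.3f}")
print(f"val loss: min {min(val_losses):.3f}, final {val_losses[-1]:.3f}")
```

The training loss keeps falling while the validation loss passes its minimum and grows, so this one-parameter-per-feature model "overfits" by the definition above without having modelled the underlying function at all.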

Any thoughts?
