Hi all,
I am a little confused by the following section of Chapter 6 of fastbook.
In this case, we’re using the validation set to pick a hyperparameter (the threshold), which is the purpose of the validation set. But sometimes students have expressed their concern that we might be overfitting to the validation set, since we’re trying lots of values to see which is the best. However, as you see in the plot, changing the threshold in this case results in a smooth curve, so we’re clearly not picking some inappropriate outlier. This is a good example of where you have to be careful of the difference between theory (don’t try lots of hyperparameter values or you might overfit the validation set) versus practice (if the relationship is smooth, then it’s fine to do this).
I must admit I belong to the aforementioned concerned students
I have thought about the paragraph for a little while but I cannot wrap my head around it.
How is "smoothness of relationship"
related to "it is ok to fine-tune hyperparams on the validation set"
?
I understand that if the metric-vs-hyperparam curve is smooth, by definition we don’t pick "some inappropriate outlier"
.
Even if this happens, we are still peaking into the “future” (the validation set is assumed to be an unbiased representation of what the real world looks like) and tweak our “past” (the model) to better align to it. Am I wrong?
Thanks, and happy hacking!