In linear regression, for example, which polynomial degree, or analogously which value of the regularization parameter (lambda), is the best one to select in the figure below?
Should we just select the lambda that has the lowest validation-set error, or should we consider the training-set error as well?
Maybe we would overfit to the validation set if we take only the validation-set error into consideration.
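
To make the question concrete, here is a minimal sketch of the selection procedure I am asking about. Everything in it is an assumption for illustration: the synthetic data, the 60/30/30 split, the fixed polynomial degree of 8, the lambda grid, and the helper names `poly_features`, `fit_ridge` and `mse`. It picks lambda by validation error alone, records the training error next to it, and then scores the chosen model once on a separate test set that played no part in the choice.

```python
import numpy as np

def poly_features(x, degree):
    # Design matrix with columns x^0, x^1, ..., x^degree.
    return np.vander(x, degree + 1, increasing=True)

def fit_ridge(X, y, lam):
    # Closed-form ridge regression; the bias column is left unpenalized.
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

def mse(X, y, w):
    # Mean squared error of the predictions X @ w.
    return float(np.mean((X @ w - y) ** 2))

# Synthetic data (assumed): a noisy sine curve on [-1, 1].
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 120)
y = np.sin(3.0 * x) + rng.normal(0.0, 0.2, x.size)

# Three disjoint subsets: train, validation, test.
degree = 8
X_train, y_train = poly_features(x[:60], degree), y[:60]
X_val, y_val = poly_features(x[60:90], degree), y[60:90]
X_test, y_test = poly_features(x[90:], degree), y[90:]

lambdas = [0.0, 1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0]
train_errors, val_errors, weights = [], [], []
for lam in lambdas:
    w = fit_ridge(X_train, y_train, lam)
    weights.append(w)
    train_errors.append(mse(X_train, y_train, w))
    val_errors.append(mse(X_val, y_val, w))

# Select lambda using the validation error only.
best_index = int(np.argmin(val_errors))
best_lam = lambdas[best_index]

# The test set is touched once, after the choice has been made.
test_error = mse(X_test, y_test, weights[best_index])

for lam, tr, va in zip(lambdas, train_errors, val_errors):
    print(f"lambda={lam:<8g} train={tr:.4f} val={va:.4f}")
print(f"selected lambda={best_lam:g}, test error={test_error:.4f}")
```

In this sketch the training error is smallest at lambda = 0 by construction, so it cannot rank the candidates on its own; what I am unsure about is whether it should still influence the choice. The held-out test error is there only to show how far the selected model's validation error might be optimistic.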