Hello Experts,

I have a question about the professor’s explanation of how big the validation set should be.

He says that each of the three sets (training, validation, test) should have a minimum of 22 samples of each class. He then states the following:

“So one approach to figuring out is my validation set big enough is train your model 5 times with exactly the same hyper parameters each time and look at the validation set accuracy each time and there is a mean and a standard deviation of 5 numbers you could use or a maximum and a minimum you can use. But to save yourself some time, you can figure out straight away that okay, I have a .99 accuracy as to whether I get the cat correct or not correct. So therefore the standard deviation is equal to 0.99 * 0.01 and then I can get the standard error of that”

From the professor’s explanation, I understand that we need a minimum of 22 observations to calculate the mean and standard deviation of the validation set accuracy scores. However, in the passage above, the professor states that just 5 validation scores are enough to calculate the mean and standard deviation of the validation set accuracy.
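
To make the two approaches in the quote concrete, here is a minimal sketch of how I read them. Everything in it is my own assumption rather than something from the lecture: the five run accuracies and the validation set size of 2000 are made-up numbers, and the function names are hypothetical. I have also assumed the usual binomial formula, where p * (1 - p) is the variance of a single correct/incorrect prediction and the standard error of the accuracy is sqrt(p * (1 - p) / n), which is slightly different from how the quote words it.

```python
import math
import statistics


def summarize_runs(accuracies):
    """Approach 1: mean and sample standard deviation of repeated runs."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


def binomial_standard_error(accuracy, n_samples):
    """Approach 2: standard error of an accuracy estimated on n_samples.

    Assumes each prediction is an independent correct/incorrect trial,
    so the per-sample variance is accuracy * (1 - accuracy).
    """
    variance_per_sample = accuracy * (1.0 - accuracy)
    return math.sqrt(variance_per_sample / n_samples)


# Hypothetical validation accuracies from 5 runs with identical hyperparameters.
run_accuracies = [0.990, 0.988, 0.991, 0.989, 0.992]
mean_acc, std_acc = summarize_runs(run_accuracies)

# Hypothetical validation set size; 0.99 is the accuracy used in the quote.
se = binomial_standard_error(0.99, 2000)

print(f"mean over runs: {mean_acc:.4f}, std over runs: {std_acc:.4f}")
print(f"binomial standard error: {se:.4f}")
```

If this reading is right, the 5 in the first approach counts training runs, while n in the second approach counts validation samples, so the two numbers measure different things.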

Could someone kindly clarify how these two statements fit together?

Regards,

Kiran Hegde