Hyperparameter Tuning and number of epochs

adeperio · September 19, 2019, 1:47am

So far, when tuning my hyperparameters (learning rate, batch size, etc.), I have been using fit_one_cycle, training for 3 epochs, and observing my metrics to determine which hyperparameters might be best for my models.

Is 3 epochs enough? Would using 1 epoch be sufficient to find the best hyperparameters?

muellerzr · September 19, 2019, 1:48am

What I would do is repeat each experiment several times and test for statistical significance.

E.g., train a baseline, see which changes are statistically significant against that baseline, and move on from there (it could take as few as five runs).

As for the number of epochs: train until you start to notice changes in the metrics. That is how I would define “enough”.
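One way to make "statistically significant against the baseline" concrete is a permutation test over repeated runs. The following is only a minimal sketch, not something from this thread: the function name `permutation_p_value` is hypothetical, the accuracy values are made-up placeholders, and it assumes you have already collected one metric value per run (for example, five runs per configuration).

```python
import random
from statistics import mean


def permutation_p_value(baseline, candidate, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference of mean metrics.

    Returns the estimated probability of seeing a difference at least as
    large as the observed one if the run labels were interchangeable.
    """
    rng = random.Random(seed)
    observed = abs(mean(candidate) - mean(baseline))
    pooled = list(baseline) + list(candidate)
    n_base = len(baseline)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign runs to the two groups and recompute the gap.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_base:]) - mean(pooled[:n_base]))
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Placeholder validation accuracies from five runs each (made-up numbers).
baseline_acc = [0.712, 0.705, 0.718, 0.709, 0.714]
candidate_acc = [0.731, 0.726, 0.735, 0.729, 0.733]

p = permutation_p_value(baseline_acc, candidate_acc)
print(f"p-value: {p:.4f}")
```

A small p-value (commonly below 0.05) suggests the change is unlikely to be run-to-run noise. With only a handful of runs per configuration the test has limited power, so a large p-value does not prove the change is useless.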

adeperio · September 19, 2019, 1:56am

When evaluating, I’ve been using a baseline as suggested and comparing other experiments against that.

I was hoping, though, that 1 epoch (or some other low number) would be enough to predict the likely model performance when training for more epochs with the chosen hyperparameters. But it sounds like I will need to watch for a statistically significant change to be certain.
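Whether a short run predicts a longer one can also be checked directly: train a few configurations at both budgets once, then compare how they rank. Below is a rough sketch using a Spearman rank correlation in plain Python. The helper names `average_ranks` and `spearman` are hypothetical, and the scores are made-up placeholders rather than results from any real experiment.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Placeholder metrics for the same five configurations (made-up numbers).
score_after_1_epoch = [0.61, 0.58, 0.66, 0.55, 0.63]
score_after_5_epochs = [0.74, 0.73, 0.79, 0.70, 0.75]

rho = spearman(score_after_1_epoch, score_after_5_epochs)
print(f"rank correlation: {rho:.2f}")
```

A correlation close to 1 would suggest the cheap budget orders the configurations the same way as the longer one. A low value would be a sign that more epochs are needed before comparing.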

Thank you!

muellerzr · September 19, 2019, 1:58am

No problem! It also depends on the problem. E.g., for image classification, 5 epochs could be enough, depending on the task (look at the ImageWoof experiments), but for tabular data it could need fewer (2-3).

adeperio · September 19, 2019, 2:08am

Oh great, thanks, that sounds like a good rule of thumb that I could use.