Hyperparameter Tuning and number of epochs

So far, when tuning my hyperparameters (learning rate, batch size, etc.), I have been using `fit_one_cycle`, training for 3 epochs, and observing my metrics to decide which hyperparameters might be best for my models.

Is 3 epochs enough? Would using 1 epoch be sufficient to find the best hyperparameters?

My best bet would be to repeat each run several times and test for statistical significance.

E.g. train a baseline, check which changes are statistically significant against that baseline, and move on from there (it could take as few as five runs).

As for the number of epochs, I would train until you start to notice changes. That is how I would define “enough”.
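As a rough sketch of the baseline comparison described above: the snippet below runs a two-sided permutation test on the final metric from five runs of each configuration. The accuracy values are made-up placeholders, and `permutation_test` is a hypothetical helper written for illustration, not part of fastai; a permutation test is just one reasonable choice of significance test here.

```python
import random
from statistics import mean


def permutation_test(baseline, candidate, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in mean metric."""
    rng = random.Random(seed)
    observed = abs(mean(candidate) - mean(baseline))
    pooled = list(baseline) + list(candidate)
    n_base = len(baseline)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign runs to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_base:]) - mean(pooled[:n_base]))
        if diff >= observed:
            hits += 1
    # Add-one smoothing so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Placeholder accuracies from five runs per configuration (not real results).
baseline_acc = [0.712, 0.705, 0.718, 0.709, 0.714]
candidate_acc = [0.731, 0.726, 0.735, 0.729, 0.733]

p_value = permutation_test(baseline_acc, candidate_acc)
print(f"mean diff = {mean(candidate_acc) - mean(baseline_acc):.4f}, p = {p_value:.4f}")
```

If the p-value stays large after a handful of runs, the difference between the two settings is probably within run-to-run noise at that number of epochs.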


When evaluating, I’ve been using a baseline as suggested and comparing other experiments against that.

I was hoping, though, that 1 or a low number of epochs would be enough to predict the likely model performance when running for more epochs with the chosen hyperparameters. But it sounds like I will need to watch for a statistically significant change to be certain.

Thank you!


No problem :slight_smile: It also depends on the problem. E.g. for image classification, 5 epochs could be enough (look at the ImageWoof experiments), but for tabular data it could need fewer (2-3).


Oh great, thanks, that sounds like a good rule of thumb I could look to use :slight_smile: