I would like to know whether these two approaches do the same thing, because in my experiments another epoch can reduce the model's accuracy. With the one-at-a-time approach I could save each stage and return to the most successful one if I wanted, right?
Does anyone have a better way of doing this, or know whether I'm completely wrong about it?
They’re not equivalent. fit_one_cycle varies the learning rate across epochs: it starts low, ramps up to the max_lr you pass (or the default of 0.003 if you pass nothing), and then anneals back down.
You’re right that the learning rate gets adjusted after each batch, but the cycle spans the total number of epochs (it’s one cycle per fit_one_cycle call, not one cycle per epoch).
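Here’s roughly what that schedule looks like if you plot it. This is a standalone sketch, not fastai’s actual implementation, and the epoch/batch counts and pct_start value are just made-up examples to show the shape:

```python
# Rough sketch of a one-cycle LR schedule spanning an entire
# fit_one_cycle call (10 epochs here), not one cycle per epoch.
import numpy as np
import matplotlib.pyplot as plt

n_epochs, batches_per_epoch = 10, 100   # hypothetical training run
max_lr, pct_start = 3e-3, 0.3           # fastai-style defaults
start_lr = max_lr / 25                  # assumed starting LR

total = n_epochs * batches_per_epoch
warmup = int(total * pct_start)

# Cosine ramp up for the first pct_start of all batches...
up = start_lr + (max_lr - start_lr) * (1 - np.cos(np.linspace(0, np.pi, warmup))) / 2
# ...then cosine anneal back down over the remaining batches.
down = max_lr * (1 + np.cos(np.linspace(0, np.pi, total - warmup))) / 2

plt.plot(np.concatenate([up, down]))
plt.xlabel('batch'); plt.ylabel('learning rate')
plt.title('One cycle across all 10 epochs')
plt.show()
```

Calling fit_one_cycle(1) ten times instead would give you ten of these small cycles back to back, which is a different schedule.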
OK, so if I run 10 epochs but see that the model could reach better accuracy with some more cycles, should I restore the model to its state from before those 10 epochs (restart it) and run 12 instead? Is there no way to fit 2 more using the same values? Or should I just go on to refining it with the unfreeze function?
Jeremy got asked this question in the Lesson 2 video and said he didn’t know whether one would be better than the other. (Look around 71 minutes into the video.)
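For what it’s worth, one practical pattern is to checkpoint after each cycle so you can always return to the most successful stage, along the lines of what the original question suggested. A sketch, assuming `learn` is an existing fastai Learner and the stage names are just examples:

```python
# Run a full one-cycle fit, then snapshot the weights.
learn.fit_one_cycle(10, 3e-3)
learn.save('stage-1')

# Try a shorter follow-up cycle, typically with a lower max LR.
learn.fit_one_cycle(2, 3e-4)
learn.save('stage-2')

# If the extra cycle hurt accuracy, reload the earlier snapshot.
learn.load('stage-1')
```

This doesn’t resume the original schedule mid-cycle, but it does let you experiment with extra cycles without losing your best weights.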