Well, they used 1,250 and then changed the error metric for the loss function. Still, look who ran those experiments: NVIDIA, the company that makes GPUs. If they have a spare server, they don't mind running experiments. In the lessons, though, they say only a few epochs (or cycles?) are needed, and that after running for 4-8 you can run "lr_finder" again to start new learning cycles. I used about 20 epochs (because I was training on my laptop overnight). There was a similar question, I think: "How much epochs to train for using OneCycle Policy?", but I haven't read it through.
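For reference, a minimal sketch of what the one-cycle schedule itself does, so the epoch counts above have some context. This is a plain-Python approximation of the shape (cosine warm-up to a peak LR, then cosine anneal down), not fastai's actual implementation; the function name, `pct_start`, `div`, and `final_div` defaults are my assumptions, loosely modeled on common one-cycle parameters:

```python
import math

def one_cycle_lr(step, total_steps, lr_max,
                 pct_start=0.3, div=25.0, final_div=1e4):
    """Hypothetical one-cycle LR sketch (not fastai's code).

    Warms up from lr_max/div to lr_max over the first pct_start of
    steps, then anneals down to lr_max/final_div, both via cosine.
    """
    warm_steps = int(total_steps * pct_start)
    if step < warm_steps:
        # Warm-up phase: cosine ramp from the starting LR up to lr_max.
        p = step / max(1, warm_steps)
        lr_start = lr_max / div
        return lr_start + (lr_max - lr_start) * (1 - math.cos(math.pi * p)) / 2
    # Annealing phase: cosine decay from lr_max down to the final LR.
    p = (step - warm_steps) / max(1, total_steps - warm_steps)
    lr_end = lr_max / final_div
    return lr_max + (lr_end - lr_max) * (1 - math.cos(math.pi * p)) / 2
```

So whether you run the cycle over 4-8 epochs or 20, the schedule just stretches over however many total steps you give it; re-running the LR finder afterwards picks a fresh `lr_max` for the next cycle.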