I have a somewhat noob question. Suppose I have been training a network for, say, an hour or two and I have pretty good accuracy. When I run lr_find, I get an almost flat line.
Would it still be worth my time to train it for a few more hours? I hear of people training a network for 10 hours and getting better results. Does this still work if the loss landscape is almost flat and the loss plots are also almost flat?
Thank you in advance!!