When I run long-running experiments, I often lose the network connection to the notebook partway through. When I come back, I want to check the learner.fit() output (trn and val loss, acc, etc.), but the displayed output is often frozen at some point in the past.
Once training finishes, I can check learner.sched.best and call learner.load('saved-file'), but I can't see things like how trn_loss and val_loss relate over time.
How do you ensure that this kind of long-running output continues to update after reconnecting to the notebook?
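To make the question concrete, the only workaround I can think of is to write each epoch's metrics to a file on disk, so the history survives even when the notebook output does not. Below is a minimal sketch in plain Python. It does not use any fastai API: log_epoch and read_log are hypothetical helpers, and I am assuming the per-epoch numbers can be passed to something like this from inside the training loop.

```python
import csv
import os


def log_epoch(path, epoch, trn_loss, val_loss, acc):
    """Append one epoch's metrics to a CSV file (hypothetical helper)."""
    # Write the header only when the file is new or empty.
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["epoch", "trn_loss", "val_loss", "acc"])
        writer.writerow([epoch, trn_loss, val_loss, acc])
    # Closing the file after every epoch flushes it, so the data is on disk
    # even if the browser session is lost right afterwards.


def read_log(path):
    """Read the metrics back as a list of dicts (hypothetical helper)."""
    with open(path, newline="") as f:
        return [
            {
                "epoch": int(row["epoch"]),
                "trn_loss": float(row["trn_loss"]),
                "val_loss": float(row["val_loss"]),
                "acc": float(row["acc"]),
            }
            for row in csv.DictReader(f)
        ]
```

After reconnecting, calling read_log on the same path from a fresh cell would give the full trn_loss and val_loss history so far, independent of whatever the frozen cell shows. Is there a built-in way to get the same effect?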