Obtain `learn.fit` results as a dataframe

nithanaroy · October 4, 2019, 6:59pm

It would be nice if learn.fit or learn.fit_one_cycle returned a dataframe with the results, for logging and TensorBoard-like visualizations.

nithanaroy · October 4, 2019, 9:17pm

Currently this can be achieved in a hacky way by redirecting stdout:

import io
from contextlib import redirect_stdout

import pandas as pd

with io.StringIO() as buf, redirect_stdout(buf):
    learn.fit_one_cycle(3, slice(lr))
    results = buf.getvalue()  # text of the printed results table

# Parse the whitespace-separated table; the raw string avoids an invalid-escape warning
df = pd.read_csv(io.StringIO(results), sep=r"\s+")

but then the progress output is lost completely.
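One way to keep the progress visible while still capturing it is to "tee" stdout, so every write goes both to the real console and to a buffer. The sketch below is a minimal illustration using only the standard library; `Tee` and `run_and_capture` are hypothetical helper names, not part of fastai, and the approach only helps when the results table is actually printed to stdout (for example with console-style progress output rather than an HTML progress bar in a notebook).

```python
import io
import sys
from contextlib import redirect_stdout


class Tee:
    """Hypothetical helper: forwards every write to several streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


def run_and_capture(func, *args, **kwargs):
    """Call func, echoing its stdout live while also keeping a copy."""
    buf = io.StringIO()
    with redirect_stdout(Tee(sys.stdout, buf)):
        result = func(*args, **kwargs)
    return result, buf.getvalue()
```

Assuming the table is printed as plain text, something like `_, results = run_and_capture(learn.fit_one_cycle, 3, slice(lr))` would then yield the same `results` string as before, which can be parsed with `pd.read_csv` in the same way.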

ilovescience · October 4, 2019, 10:39pm

If there is something special you want to do with fastai, a good idea is to check whether there's already a callback for it. Indeed there is one: CSVLogger.

I have used this often. The only problem is that if training is interrupted, it will not save. Otherwise, it works perfectly.

tmjiang · October 5, 2019, 6:18am

For interrupted training, CSVLogger probably won’t work, as @ilovescience mentioned.
Instead, one may use the notebook magic %%capture with the console-logging version of fastprogress.

nithanaroy · October 5, 2019, 6:14pm

These are great ideas. But I think saving and restoring stats from a file isn’t optimal due to the unnecessary IO. Something like the history returned by Keras fit() is quite helpful in distributed training, as we need not worry about unique history file names.

tmjiang · October 5, 2019, 11:39pm

There is also the TensorBoard callback, which uses file I/O like the TensorFlow version.
Avoiding files is possible, just as you or %%capture do. In my opinion this is more about whether to do it internally as callbacks or externally as decorators.

ilovescience · October 6, 2019, 12:32am

I am still unsure why CSVLogger is not good enough?

jeremy · October 7, 2019, 4:52am

The fit() history is available in the recorder attribute of the learner, IIRC.

nithanaroy · October 7, 2019, 11:44pm

Yup, it has all the pieces needed - train loss per batch, valid loss per epoch, accuracy per epoch. Is there any method that stitches all this data and returns a dictionary / data frame, especially for training loss? Something like the table that gets printed during fit() - https://docs.fast.ai/basic_train.html#Recorder.plot_losses
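A rough sketch of such a stitching helper is below. It assumes the recorder exposes `losses` (one entry per training batch), `nb_batches` (batches per epoch), `val_losses`, `metrics` (one list per epoch) and `metrics_names`; those attribute names are an assumption about fastai v1's Recorder and should be checked against the installed version. `recorder_to_rows` is a hypothetical name, and the example runs on a stand-in object with made-up numbers rather than a real learner.

```python
from types import SimpleNamespace


def recorder_to_rows(recorder):
    """Return one dict per epoch: mean train loss, valid loss, each metric."""
    rows, start = [], 0
    for epoch, n_batches in enumerate(recorder.nb_batches):
        # Slice out this epoch's per-batch losses (assumes n_batches > 0)
        batch_losses = [float(x) for x in recorder.losses[start:start + n_batches]]
        start += n_batches
        row = {
            "epoch": epoch,
            "train_loss": sum(batch_losses) / len(batch_losses),
            "valid_loss": float(recorder.val_losses[epoch]),
        }
        for name, value in zip(recorder.metrics_names, recorder.metrics[epoch]):
            row[name] = float(value)
        rows.append(row)
    return rows


# Stand-in for learn.recorder; the values are illustrative, not real results
fake_recorder = SimpleNamespace(
    nb_batches=[2, 2],
    losses=[1.0, 0.5, 0.25, 0.25],
    val_losses=[0.75, 0.5],
    metrics=[[0.5], [0.75]],
    metrics_names=["accuracy"],
)
rows = recorder_to_rows(fake_recorder)
```

With pandas available, `pd.DataFrame(rows)` turns the result into a dataframe. Note that the per-epoch mean computed here may not match the train_loss column of the printed table exactly.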

ilovescience · October 7, 2019, 11:46pm

Still not sure why the CSV callback will not work. Or even if it does not, using the source code for the callback and adapting it for your own use-case could be helpful.
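To make the "adapt the callback" idea concrete, here is a toy sketch of a logger that keeps the per-epoch rows in memory instead of writing a CSV. Everything in it is hypothetical: `HistoryCallback`, its `on_epoch_end` arguments, and `toy_fit` are illustrative stand-ins and do not reproduce fastai's actual callback signature or CSVLogger's source; the loss values are made up.

```python
class HistoryCallback:
    """Hypothetical callback that collects per-epoch stats in memory."""

    def __init__(self):
        self.rows = []

    def on_epoch_end(self, epoch, names, values):
        # Same information a CSV logger would write, kept as a dict instead
        self.rows.append({"epoch": epoch, **dict(zip(names, values))})


def toy_fit(n_epochs, callback):
    """Stand-in for a training loop; the numbers are fabricated placeholders."""
    for epoch in range(n_epochs):
        train_loss = 1.0 / (epoch + 1)
        valid_loss = 1.5 / (epoch + 1)
        callback.on_epoch_end(
            epoch, ["train_loss", "valid_loss"], [train_loss, valid_loss]
        )
    return callback.rows  # returned to the caller, no file involved


history = toy_fit(2, HistoryCallback())
```

Because each callback instance holds its own rows, parallel learners would not need to coordinate file names; whether that is worth the extra code over CSVLogger depends on the setup.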

nithanaroy · October 8, 2019, 12:14am

It incurs additional IO of writing / reading to / from disk and puts the burden on the user to collect the right metrics in distributed training environments where multiple learners are working in parallel sharing the same disk. My thought is it would be nice if this complexity could be handled by the library itself like Keras does and simply return the metrics and losses on learn.fit() call.

ilovescience · October 8, 2019, 12:34am

Hmm I am not sure that is true. CSVLogger is simply taking the information from the Learner.Recorder object and saving it into a CSV IIRC.