Expand Recorder to deal with non int/float data

(Stas Bekman) #1

Currently, Recorder can only handle int/float data, so, for example, CSVLogger can’t save the time column of per-epoch training. Time could be stored as a float number of seconds, but then its display wouldn’t be user-friendly, so perhaps a better approach would be a way to display data differently from how it’s stored. But then someone may want to store strings, so more flexibility is needed.
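To illustrate the "store one way, show another" idea, here is a minimal sketch. It is not fastai's code: `format_seconds`, `format_row`, and the column names are hypothetical, chosen only to show time kept as float seconds and formatted at display time.

```python
def format_seconds(seconds: float) -> str:
    """Render a duration stored as float seconds as 'mm:ss' or 'h:mm:ss'."""
    total = int(seconds)  # drop the fractional part for display
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def format_row(row: dict) -> list:
    """Hypothetical helper: build display strings without touching stored values."""
    out = []
    for name, value in row.items():
        if name == "time":            # assumed column name
            out.append(format_seconds(value))
        elif isinstance(value, float):
            out.append(f"{value:.6f}")
        else:
            out.append(str(value))    # ints and strings pass through
    return out


# Stored stats stay numeric; only the rendering changes.
stored = {"epoch": 0, "train_loss": 0.512345678, "time": 83.4}
print(format_row(stored))
```

With this split, a logger could write the raw values and a progress display could call the formatter, so neither side constrains the other.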

(Ilia) #2

@stas I guess this one was already fixed, right? I see there is a fragment that handles time and string metrics.

(Will Sutton) #3

It looks like the time field still doesn’t load in csv_logger.read_logged_file() as of 1.0.51.

Maybe there’s a fix on the development branch?

If not, I’d volunteer to finish up this issue. @stas, you handle contributors for this issue, right?

(Stas Bekman) #4

No, I’m not a coordinator. A user asked for this feature, and Sylvain said that perhaps it could be implemented by someone who would like to contribute, so I logged it here. Apologies, I didn’t ask Sylvain how he wanted it done in the first place, and he’s currently too busy with the course to follow up. So I can’t coordinate something when I don’t know how it’s supposed to work. (Note to self: don’t propose a project on behalf of someone else without a clear spec.)

So, Will, you may implement it, but there is no spec, so it may or may not fit with what Sylvain has in mind. If you do, just submit a PR. Alternatively, you will need to wait a few more weeks until the course is over.

Meanwhile, he suggests there are workarounds using fastprogress:

  • You can edit the global constant WRITER_FN in fastprogress
  • Better: SAVE_PATH takes a file location and will do the logging for you

(Will Sutton) #5

I’ve got the fix, see demonstration below.
@sgugger: I’m going to send this in as a PR unless you have objections.

It just ports the same processing used in basic_train:Recorder, for a diff of +7/-2.

The only issue I see is that the DataFrame interprets the time values as strings (object dtype). There does appear to be a pandas type for handling “timedeltas”, but it seems more troublesome than helpful.
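If numeric time values are needed after loading, one option is to convert the strings back to seconds by hand. The sketch below uses only the standard library and is not the code in the diff; `parse_time`, `convert_cell`, the sample data, and the column names are assumptions made for illustration.

```python
import csv
import io


def parse_time(text: str) -> float:
    """Convert 'mm:ss' or 'h:mm:ss' into a number of seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def convert_cell(name: str, text: str):
    """Hypothetical per-cell conversion: time to seconds, numbers to float, rest kept as str."""
    if name == "time":  # assumed column name
        return parse_time(text)
    try:
        return float(text)
    except ValueError:
        return text


# Made-up sample standing in for a logged history file.
sample = (
    "epoch,train_loss,valid_loss,time\n"
    "0,0.512346,0.498765,01:23\n"
    "1,0.401234,0.455432,1:02:05\n"
)

rows = [
    {name: convert_cell(name, text) for name, text in row.items()}
    for row in csv.DictReader(io.StringIO(sample))
]
print(rows[0]["time"], rows[1]["time"])
```

Keeping the conversion in a separate step means the file itself can stay human-readable while analysis code works with plain floats.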

the diff file for changes: diff
notebook showing the fix: notebook

(Will Sutton) #6

Noted, and thank you!
