Some issues with callbacks

Hi @stas
When I use PeakMemMetric, it seems that I cannot get the time per epoch saved into the CSV log file (it shows fine on the screen but isn’t saved to the file). The file has a heading for the time, but the cells are empty for every epoch.

Also, under ‘gpu used’ it is showing -3. I wonder what -3 gpu used means for one epoch?

Here is the code I used that produced the above output:

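# assumes `from fastai.vision import *` was run earlier (cnn_learner, ShowGraph, accuracy, error_rate)
# and that data, arch, aName and mName are already defined
from functools import partial
from fastai.callbacks import CSVLogger, SaveModelCallback
from fastai.callbacks.mem import PeakMemMetric

learnS = cnn_learner(data, arch, pretrained=True,
                     metrics=[accuracy, error_rate],
                     callback_fns=[partial(CSVLogger, filename=str(aName)),
                                   ShowGraph, PeakMemMetric,
                                   partial(SaveModelCallback, monitor='val_loss', mode='auto', name=mName)])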

I created a topic for PeakMemMetric and moved your post here, so that we don’t discuss it in a thread dedicated to a different module.

Are you saying PeakMemMetric is interfering with CSVLogger and if you remove the former the latter behaves as expected?

Also, under ‘gpu used’ it is showing -3. I wonder what -3 gpu used means for one epoch?

It means that 3MB of GPU RAM was reclaimed, i.e. if 1005MB of GPU RAM was in use before the first epoch, 1002MB was measured after it (1005-3).

Most likely you had some RAM allocated whose variable was freed but not yet collected by gc.collect (the learn object would be the prime suspect). It’s then possible that gc.collect ran during that first epoch (it happens every N calls), so even though the first epoch certainly consumed RAM, the freeing of unrelated RAM by gc.collect offset the outcome, resulting in a negative number.
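To illustrate the "freed but not yet collected" point, here is a minimal pure-Python sketch. It has nothing fastai-specific in it: `Node` is a hypothetical stand-in for an object caught in a reference cycle, and the automatic collector is disabled only to make the demo deterministic.

```python
import gc
import weakref

class Node:
    """Hypothetical stand-in for an object that holds on to memory."""
    def __init__(self):
        self.me = self  # reference cycle: the refcount never drops to zero

gc.disable()                 # keep the automatic collector out of the demo
node = Node()
probe = weakref.ref(node)    # lets us check whether the object still exists
del node                     # the variable is gone, but the cycle keeps the object alive
alive_before = probe() is not None
gc.collect()                 # an explicit collection breaks the cycle and frees the object
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)  # True False
```

If a collection like this happens between the "before" and "after" measurements of an epoch, the memory it releases is subtracted from whatever the epoch itself consumed.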

Remember that the used memory reported by PeakMemMetric is a delta between the measurements taken before and after. Peak memory is a delta between the used memory after and the peak (and the peak measurement is currently unreliable). The peak delta is the temporary overhead.
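As a worked example of these two deltas, here is a small sketch using the numbers from above. `mem_deltas` is a hypothetical helper written for this post, not part of fastai, and the 1100MB peak is an invented value for illustration.

```python
def mem_deltas(before_mb, after_mb, peak_mb):
    """Return (used_delta, peak_delta) in MB, following the definitions above."""
    used_delta = after_mb - before_mb   # negative when more was freed than consumed
    peak_delta = peak_mb - after_mb     # temporary overhead on top of what stayed in use
    return used_delta, peak_delta

# 1005MB before the epoch, 1002MB after it, assumed peak of 1100MB during it
print(mem_deltas(1005, 1002, 1100))  # (-3, 98)
```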

To me, CSVLogger does its job well in general; it just fails to write the time per epoch into the file.
Or it’s PeakMemMetric that doesn’t pass the time value to CSVLogger properly so that it can be written to the file.

Neither PeakMemMetric nor any other built-in callback passes anything on behalf of another callback.

You still haven’t answered my question. When you remove PeakMemMetric from the callbacks, does everything work as you expected, and do things stop working only when you re-add PeakMemMetric?

Sorry about that.
Yes, it does! And it doesn’t stop working after re-adding PeakMemMetric; it just doesn’t save the time per epoch in the CSV file. As I showed in the previous comments above, the time cells remain empty.

Hi @stas

I found another, similar issue with callback_fns.
If I use TraceMallocMultiColMetric, as callback_fns = TraceMallocMultiColMetric, …
it works perfectly and produces the expected output,
although the time is still empty.
However, if I want to have RAM, CPU and GPU usage together and use this:
callback_fns = PeakMemMetric, TraceMallocMultiColMetric
what I get in the log file is the CPU/GPU usage values under the RAM usage headings, and again nothing for the time per epoch! Also, the headings and values of the last four columns are not aligned :smile:

Looks like a bug in CSVLogger: it doesn’t log the time even when used all by itself. I’m not sure how PeakMemMetric is related here.

I guess I have to explain it better. When I add PeakMemMetric to callback_fns, the only values we get in the log file are the ones we expect PeakMemMetric to produce, but the headings of the CSV file come from TraceMallocMultiColMetric, namely RAM_used, RAM_max_used, and RAM_peak. We also don’t get any values for RAM_used, RAM_max_used, and RAM_peak in the log file. If we exclude PeakMemMetric from callback_fns, we get the values for RAM_used, RAM_max_used, and RAM_peak correctly.
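For anyone who wants to check a log file for the symptoms described in this thread (rows that don't line up with the header, empty time cells), here is a rough sketch using only the standard library. `check_log` is a hypothetical helper, and the sample column names and values are made up for illustration; they are not copied from a real fastai log.

```python
import csv
import io

def check_log(text):
    """Return (misaligned_rows, empty_time_rows) as lists of 0-based data-row indices."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    # rows whose number of cells differs from the number of headings
    misaligned = [i for i, r in enumerate(body) if len(r) != len(header)]
    empty_time = []
    if "time" in header:
        t = header.index("time")
        # rows where the time cell is missing or blank
        empty_time = [i for i, r in enumerate(body) if t >= len(r) or r[t].strip() == ""]
    return misaligned, empty_time

# made-up sample: row 0 has a blank time cell, row 1 is one cell short
sample = (
    "epoch,train_loss,valid_loss,accuracy,time\n"
    "0,0.51,0.42,0.88,\n"
    "1,0.40,0.39,0.90\n"
)
print(check_log(sample))  # ([1], [0, 1])
```

To run it on a real log, read the file's contents into a string and pass that to `check_log`.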

So you flagged 3 unrelated issues, @Shahinfar.

1 - misleading time header in history.csv w/ no time saved - fixed in master - header was removed
2 - time is not being recorded - needs to be implemented by someone (see “Expand Recorder to deal with non int/float data”)
3 - bug in multiple callbacks overriding each other - fixed in master

All fixes are courtesy of @sgugger.

So please update to the master git version and hopefully all your problems will go away.

Thanks a lot for your prompt support @stas and @sgugger

I have identified another issue, which I will post, but I need to make sure it’s not on my side first before sending it to the forum.
Btw, what do you mean by “update to the master git version”? Does it mean I have to uninstall fastai and install it again?

I have identified another issue, which I will post, but I need to make sure it’s not on my side first before sending it to the forum.

None of the issues you mentioned so far were about PeakMemMetric. I renamed this thread so it won’t be misleading.

It’s probably best to post a GitHub issue if and when you identify the problem (after installing the latest fastai dev version, see below).

Btw, what do you mean by “update to the master git version”? Does it mean I have to uninstall fastai and install it again?

Please see: https://github.com/fastai/fastai/blob/master/README.md#bug-fix-install