I found this issue while calling `Interpretation.from_learner(learn)`, but AFAICT it appears anytime `Learner.get_preds()` is called if its metrics also include `LossMetrics`, coupled with a custom loss that has a `reduction` attribute*.
*Unless the partial loss attributes in the custom loss object are treated in just the right way – see below.
- `Learner.get_preds()` sets up a context manager which disables reductions, leaving losses with shape `(bs,)`. That's great for things like `Interpretation.top_losses()`.
- During the validation run a call is made to `LossMetric.accumulate()`. This method (it seems to me) simply assumes that the loss has 'mean' reduction, which would explain the multiplication by `bs`. But if the validation dataset is not exactly divisible by `bs`, the linked line will try to add two tensors of different sizes: one of size `bs` and the other of size `len(val_ds) % bs` (see the sketch after this list).
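
To make the shape mismatch concrete, here is a minimal sketch in plain PyTorch. It only models what `accumulate()` effectively does under the no-reduction context (`total += loss * bs`); the variable names are mine, not fastai's:

```python
import torch

bs = 4
total = 0.0

# Full batch: reductions are disabled, so the loss is per-item, shape (4,).
loss = torch.rand(bs)
total = total + loss * bs   # total silently becomes a tensor of shape (4,)

# Last batch: len(val_ds) % bs == 2 items remain, so the loss has shape (2,).
loss = torch.rand(2)
total = total + loss * 2    # RuntimeError: the size of tensor a (4) must
                            # match the size of tensor b (2)
```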
There are at least two ways to get around this problem.
a. The issue is trivially fixed by overwriting `learn.metrics = []`, but that's quite hacky.
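
If you go this route, it's worth saving and restoring the metrics around the call. A minimal sketch, assuming a standard fastai `Learner` named `learn`:

```python
# Hacky workaround (a): temporarily drop the metrics so that
# LossMetric.accumulate() is never called with unreduced losses.
saved_metrics = learn.metrics
learn.metrics = []
interp = Interpretation.from_learner(learn)
learn.metrics = saved_metrics
```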
b. LossMetrics
is there to read attributes of your custom loss function – so if you have two losses, you just save their values as self.loss1
and self.loss2
and add LossMetrics(["loss1", "loss2"])
to your Learner. If you make sure that the saved attributes are always mean-reduced losses (i.e. scalars), everything will work out just fine.
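
A minimal sketch of such a loss. The class name and the two component losses are hypothetical; the point is only that `self.loss1`/`self.loss2` are stored as detached scalar means no matter what `self.reduction` is set to:

```python
import torch.nn as nn
import torch.nn.functional as F

class TwoPartLoss(nn.Module):
    "Hypothetical combined loss whose partial losses are LossMetrics-safe."
    def __init__(self, reduction='mean'):
        super().__init__()
        self.reduction = reduction  # get_preds() temporarily flips this to 'none'

    def forward(self, pred, targ):
        l1 = F.cross_entropy(pred, targ, reduction='none')  # shape (bs,)
        l2 = 0.1 * pred.pow(2).mean(dim=-1)                 # shape (bs,)
        # Always store mean-reduced scalars, regardless of self.reduction,
        # so LossMetric.accumulate() sees the shape it expects.
        self.loss1 = l1.detach().mean()
        self.loss2 = l2.detach().mean()
        loss = l1 + l2
        if self.reduction == 'mean': return loss.mean()
        if self.reduction == 'sum':  return loss.sum()
        return loss  # per-item losses, e.g. for Interpretation.top_losses()
```

With that in place, something like `learn = Learner(dls, model, loss_func=TwoPartLoss(), metrics=LossMetrics(["loss1", "loss2"]))` should keep both `Interpretation.from_learner()` and per-loss metric tracking happy.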