At this timestamp, @jeremy shows how he calculates the “global loss” over all the epochs, which is given by this line:
```python
avg_loss = avg_loss * avg_mom + loss * (1 - avg_mom)
```
with initial values of:
```python
avg_mom = 0.98
batch_num, avg_loss = 0, 0.
```
I was just wondering: is this way of calculating the global loss (over all the epochs) better than simply averaging all the losses? And moreover, is this technique generalizable? For instance, if I use Adam/RMSprop/Adamax as the optimizer, will it still work? (Considering that RMSprop's default momentum is 0.)
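To make my question concrete, here is a minimal sketch of that update rule on some made-up per-batch losses. Note that because `avg_loss` starts at 0, the raw exponentially weighted moving average (EWMA) is biased toward zero early on; the debiasing step below is the standard correction from Adam, my own addition here, not necessarily what the lesson code does:

```python
# Minimal sketch of the EWMA loss update from the lesson.
# The bias correction is the standard Adam-style one; it is my addition,
# not necessarily part of the lesson code.
avg_mom = 0.98
batch_num, avg_loss = 0, 0.0

losses = [2.0, 1.5, 1.2, 1.0, 0.9]  # made-up per-batch losses
for loss in losses:
    batch_num += 1
    avg_loss = avg_loss * avg_mom + loss * (1 - avg_mom)
    # correct for the zero initialization of avg_loss
    debiased = avg_loss / (1 - avg_mom ** batch_num)
    print(f"batch {batch_num}: raw EWMA {avg_loss:.4f}, debiased {debiased:.4f}")
```

On the first batch, `debiased` equals the first loss exactly, while the raw `avg_loss` is only `0.02 * loss`, which is why some form of correction (or a long warm-up) matters with a momentum as high as 0.98.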
I always use an “Average meter” to calculate this global loss, like this:
```python
class AverageMeter(object):
    """Computes and stores the average and current value."""
    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count
```
And I used it like so:
```python
for epoch in range(...):
    [...]
    # forward
    logits = self.net.forward(inputs)
    # backward + optimize
    loss = loss_fnc(logits, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # loss.item() in PyTorch >= 0.4; this was loss.data[0] in older versions
    losses.update(loss.item(), batch_size)
    logs = metrics_list(targets, logits)
    [...]
return losses.avg
```
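For reference, here is a toy comparison of my own (the decaying loss sequence is made up, not from any real run) showing how the two estimators can disagree when the loss is still falling: the plain running mean is dragged up by early high-loss batches, while the EWMA tracks recent batches more closely:

```python
# Toy comparison (my own sketch, made-up numbers): plain running mean
# vs. the EWMA from the lesson, on a loss that decays during training.
avg_mom = 0.98
avg_loss, mean_sum, n = 0.0, 0.0, 0

for i in range(1, 201):
    loss = 2.0 / i  # hypothetical per-batch loss that keeps decreasing
    n += 1
    mean_sum += loss                                      # AverageMeter-style
    avg_loss = avg_loss * avg_mom + loss * (1 - avg_mom)  # EWMA-style
    debiased = avg_loss / (1 - avg_mom ** n)

print(f"last batch loss: {2.0 / 200:.4f}")
print(f"running mean:    {mean_sum / n:.4f}")  # pulled up by early batches
print(f"debiased EWMA:   {debiased:.4f}")      # closer to recent losses
```

So the two really do answer slightly different questions: the running mean is the exact average loss over the whole epoch, while the EWMA is a smoothed estimate of the *current* loss, which is presumably why it is used for plotting.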
But I wasn’t sure if that was the right way to calculate the global loss. Any thoughts on this?