Hi, can anyone explain why Jeremy, later in the notebook, computes the validation loss like this:
```python
with torch.no_grad():
    losses, nums = zip(
        *[loss_batch(model, loss_func, xb, yb) for xb, yb in valid_dl]
    )
val_loss = np.sum(np.multiply(losses, nums)) / np.sum(nums)
```
when earlier he was calculating it like this:
```python
with torch.no_grad():
    valid_loss = sum(loss_func(model(xb), yb) for xb, yb in valid_dl)

print(epoch, valid_loss / len(valid_dl))
```
Why the change, and what is the difference mathematically?
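For concreteness, here is how I currently read the two versions: the first weights every batch equally, while the second weights each batch by its size. A tiny sketch with made-up numbers (not from the notebook), assuming the last batch is smaller because the dataset size isn't divisible by the batch size:

```python
import numpy as np

# Made-up per-batch losses and batch sizes; the last batch is smaller,
# as happens whenever len(dataset) % batch_size != 0.
losses = np.array([0.50, 0.40, 0.90])
nums = np.array([64, 64, 8])

# Earlier version: plain mean over batches, every batch counts equally.
unweighted = losses.sum() / len(losses)  # (0.50 + 0.40 + 0.90) / 3 = 0.60

# Later version: mean over samples, each batch weighted by its size.
weighted = np.sum(np.multiply(losses, nums)) / np.sum(nums)
# (0.50*64 + 0.40*64 + 0.90*8) / 136 ≈ 0.4765

print(unweighted, weighted)
```

If that reading is right, the two only agree when every batch has the same number of items. Is that the whole reason for the change, or is there more to it?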