Problem: The column indexing approach to calculating loss for classification problems with more than 2 classes does not return the loss, but the negative probabilities
In chapter 4 of Deep Learning for Coders with Fastai and Pytorch, Jeremy describes using an indexing approach to calculate the loss on a ‘3’ vs ‘7’ binary classification problem like so:
softmax_activations = torch.softmax(acts, dim=1)
softmax_activations
tensor([[0.6025, 0.3975],
[0.5021, 0.4979],
[0.1332, 0.8668],
[0.9966, 0.0034],
[0.5959, 0.4041],
[0.3661, 0.6339]])
target = tensor([0,1,0,1,1,0])
index = range(6)
softmax_activations[index, target]
tensor([0.6025, 0.4979, 0.1332, 0.0034, 0.4041, 0.3661])
By indexing into the ‘3’ activation column when target = 0 (i.e. a ‘7’) and the ‘7’ column when the target is 1 (i.e. a ‘3’), you get the loss.
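For context, the same row/column indexing works unchanged when there are more than two columns — each row still contributes exactly one value, the target class's probability. A quick sketch with a made-up 3-class tensor (values are hypothetical, not from the book):

```python
import torch

# Hypothetical 3-class softmax outputs (each row sums to 1)
probs = torch.tensor([[0.7, 0.2, 0.1],
                      [0.1, 0.3, 0.6],
                      [0.5, 0.4, 0.1]])
targ = torch.tensor([0, 2, 1])

# Same trick as the 2-class case: row i, column targ[i]
picked = probs[range(len(targ)), targ]
print(picked)  # tensor([0.7000, 0.6000, 0.4000])
```

No summing over the other columns happens; the indexing just selects the probability assigned to the correct class for each example.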
Jeremy says that this approach scales to more than two classes, but I don’t see how. Surely you would need to sum all the columns other than your target to get the loss? However, when I use F.nll_loss, which uses the indexing approach, on a multiclass classification example like so:
sm_acts = torch.softmax(acts, 1); sm_acts
tensor([[0.7189, 0.1380, 0.1431],
[0.0443, 0.1951, 0.7606],
[0.5092, 0.3590, 0.1318],
[0.1392, 0.1421, 0.7187],
[0.0353, 0.6423, 0.3224],
[0.2209, 0.1252, 0.6540]])
targ = tensor([0, 0, 1, 1, 2, 2])
F.nll_loss(sm_acts, targ, reduction='none')
tensor([-0.7189, -0.0443, -0.3590, -0.1421, -0.3224, -0.6540])
It gives me the negative probability of the target class for each example, not the loss. Can someone clear this up for me?
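For reference, here is a small sketch (with made-up activations) of what I believe F.nll_loss is doing: with reduction='none' it just negates the indexed value, so it only produces the actual negative log-likelihood loss if you feed it log-probabilities from log_softmax rather than probabilities from softmax:

```python
import torch
import torch.nn.functional as F

# Hypothetical raw activations for 2 examples, 3 classes
acts = torch.tensor([[ 0.6, -1.0, -0.9],
                     [-1.5,  0.0,  1.4]])
targ = torch.tensor([0, 2])

# Fed plain softmax probabilities, nll_loss returns -probs[i, targ[i]]
sm = torch.softmax(acts, dim=1)
print(F.nll_loss(sm, targ, reduction='none'))      # equals -sm[range(2), targ]

# Fed log-probabilities instead, the same call gives -log(p), the NLL loss
log_sm = torch.log_softmax(acts, dim=1)
print(F.nll_loss(log_sm, targ, reduction='none'))  # equals -log_sm[range(2), targ]
```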