Question about log likelihood


I'm reading the fastai book and have a question about the table on page 199:

I don't understand the values in the loss column. I would have taken loss = 1 - (softmax value of the correct class) instead of loss = the softmax value of the correct class.

As written, a perfect prediction would also get a higher loss value. Shouldn't it be the opposite?!

I guess I missed something but can't figure out what. Can someone please clarify?

Thanks :slight_smile:

It is the opposite. You don’t have a “perfect prediction” anywhere; you have predicted the likelihood of your input being a 3 or a 7, and some of those predictions are correct.

In row 4 (idx=3) where it was really sure and correct, the probability was .99664 so the loss was very low.

In row 2 (idx=1) it was correct but not very sure, prob was .502065 which is pretty much a 50/50 guess, so the loss was much higher.


Hello Joe, thanks for your quick answer.

We do indeed have a probability of 0.99664 of being a 3, but targ is 1. I assumed that targ equal to 1 corresponds to 7s and targ equal to 0 corresponds to 3s. Isn't that the case? I can't find any mention of that in the text…

Yes, your understanding is correct: the model wrongly predicted the entry in row 4 as a 3. As shown on page 200, softmax + NLL gives -0.00336017 for that entry, because in PyTorch NLL does not take the log itself; it just picks the input value at the target index and adds the negative sign.

However, if you apply log_softmax + NLL (which is what cross entropy actually computes in PyTorch), the value for this entry becomes 5.6958 (i.e., -log(0.00336)), as shown on page 202. That way the wrong prediction is penalized with a higher loss.
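To make the difference concrete, here is a rough sketch in plain Python that reproduces the two numbers for row 4 (idx=3). It is not the book's code: `nll_pick` is a hypothetical one-row stand-in I wrote to mimic what NLL does (index the target column, negate, no log), and the probabilities are the ones quoted in this thread.

```python
import math

def nll_pick(row, target):
    # Hypothetical one-row stand-in for NLL: pick the value at the
    # target index and negate it. Note that it does NOT take a log.
    return -row[target]

# Softmax outputs for row idx=3, as quoted in this thread: [P(3), P(7)]
probs = [0.99664, 0.00336]
targ = 1  # assuming targ == 1 means "7", as discussed above

# softmax + NLL: just the negated probability of the target class
softmax_plus_nll = nll_pick(probs, targ)

# log_softmax + NLL: negated log-probability of the target class
log_probs = [math.log(p) for p in probs]
log_softmax_plus_nll = nll_pick(log_probs, targ)

# For comparison: the loss this row would get if the target had been "3"
loss_if_correct = nll_pick(log_probs, 0)

print(f"softmax + NLL:        {softmax_plus_nll:.5f}")      # -0.00336
print(f"log_softmax + NLL:    {log_softmax_plus_nll:.4f}")  # 5.6958
print(f"same row, target = 3: {loss_if_correct:.4f}")
```

So a confident wrong prediction only gets a large loss once the log is applied; without it, the number is just the negated probability of the correct class.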

It’s neg loss? That would have been useful information :sweat_smile:

I tried finding it in GitHub but I couldn’t locate it for some reason.

4th entry in the tensors shown.

Thanks guys; it's clearer now.

I think that the term “loss” in this table is very misleading :slight_smile: