# How does the loss function know whether the prediction is correct or not?

Hey guys, currently re-reading chapter 04_mnist_basics and I got stumped on the following example:

Suppose we had three images which we knew were a 3, a 7, and a 3. And suppose our model predicted with high confidence (`0.9`) that the first was a 3, with slight confidence (`0.4`) that the second was a 7, and with fair confidence (`0.2`), but incorrectly, that the last was a 7.

They then created two variables to represent the predictions and the targets:

``````
trgts  = tensor([1,0,1])
prds   = tensor([0.9, 0.4, 0.2])
``````

Then after creating the loss function:

``````
def mnist_loss(predictions, targets):
    return torch.where(targets==1, 1-predictions, predictions).mean()
``````

They ran the `torch.where` part on its own to show the loss for each image:

``````
Input: torch.where(trgts==1, 1-prds, prds)
Output: tensor([0.1000, 0.4000, 0.8000])
``````

To calculate the final loss as a scalar, they ran:

``````
Input: mnist_loss(prds,trgts)
Output: tensor(0.4333)
``````

Here is what stumped me:
The chapter then says that changing the prediction for the one “false” target from 0.2 to 0.8 makes the loss go down, which indicates a better prediction.
Indeed, after running the code, it does:

``````
Input: mnist_loss(tensor([0.9, 0.4, 0.8]),trgts)
Output: tensor(0.2333)
``````

This obviously does not make sense. Since the “model” predicted the wrong target with a high level of confidence, that should indicate that the model is not performing so well and should thus increase the loss, not lower it.

I believe I am missing something here and I think it lies within how the torch.where function seems to work. I’m not 100% sure though. Any help is greatly appreciated.

Hi @chxnge

I think that the below line is counterintuitive:

``````
Input: mnist_loss(tensor([0.9, 0.4, 0.8]),trgts)
Output: tensor(0.2333)
``````

Have a look at the original tensors:

``````
prds   = tensor([0.9, 0.4, 0.2])
trgts  = tensor([1,0,1])
``````

So basically, the model predicted 0.2 for the last image (`prds[2]`) when it should have predicted 1 to be correct, so the loss for that image was 0.8 (1-0.2).

Keep in mind that the function `mnist_loss` takes the predictions and the targets as input:

``````
def mnist_loss(predictions, targets):
    return torch.where(targets==1, 1-predictions, predictions).mean()
``````

so in our case:

``````
mnist_loss(tensor([0.9, 0.4, 0.8]),trgts)
``````

works out these per-image losses before taking their mean:

``````
tensor([0.1, 0.4, 0.2])  # <- the last value is 1-0.8, as the target was 1
``````

versus `mnist_loss(tensor([0.9, 0.4, 0.2]),trgts)`, where the per-image losses are:

``````
tensor([0.1, 0.4, 0.8])  # <- the last value is 1-0.2, as the target was 1
``````
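To see where both sets of numbers come from, here is a minimal sketch in plain Python that mimics the element-wise rule of `torch.where(trgts==1, 1-prds, prds)` followed by the mean. It uses lists instead of tensors, and the helper names `per_image_loss` and `mean_loss` are made up for this illustration (they are not part of fastai or PyTorch).

```python
def per_image_loss(prds, trgts):
    # Same rule as torch.where(trgts==1, 1-prds, prds), one element at a time:
    # target 1 (a 3) -> loss is 1 - prediction
    # target 0 (a 7) -> loss is the prediction itself
    return [1 - p if t == 1 else p for p, t in zip(prds, trgts)]


def mean_loss(prds, trgts):
    # Average the per-image losses into a single number
    losses = per_image_loss(prds, trgts)
    return sum(losses) / len(losses)


trgts = [1, 0, 1]
before = [0.9, 0.4, 0.2]  # last prediction leans towards a 7
after = [0.9, 0.4, 0.8]   # last prediction leans towards a 3

print([round(x, 4) for x in per_image_loss(before, trgts)])  # [0.1, 0.4, 0.8]
print(round(mean_loss(before, trgts), 4))                    # 0.4333
print([round(x, 4) for x in per_image_loss(after, trgts)])   # [0.1, 0.4, 0.2]
print(round(mean_loss(after, trgts), 4))                     # 0.2333
```

Only the third entry changes between the two runs, and that entry alone accounts for the drop from 0.4333 to 0.2333.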

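Really, the prediction here is the model's confidence that the image is a 3, not its confidence in whichever digit it picked. So for the 3rd image, whose target is 1, what matters is how close the prediction is to 1: 0.2 leans towards a 7 (wrong), while 0.8 leans towards a 3 (right), which is why the loss goes down.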
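If it helps, this reading of the predictions can be sketched in plain Python. The `predicted_digit` helper and its 0.5 cut-off are assumptions made up for this illustration, not something taken from the chapter's code shown above. The loss for each image is written as `abs(t - p)`, which is the same as 1-p when the target is 1 and p when the target is 0.

```python
def predicted_digit(p, threshold=0.5):
    # p is read as "confidence that the image is a 3";
    # the 0.5 cut-off is an assumption made for this illustration.
    return 3 if p > threshold else 7


trgts = [1, 0, 1]  # 1 means the image is a 3, 0 means it is a 7

for label, prds in [("before", [0.9, 0.4, 0.2]), ("after", [0.9, 0.4, 0.8])]:
    for p, t in zip(prds, trgts):
        actual = 3 if t == 1 else 7
        guess = predicted_digit(p)
        loss = abs(t - p)  # same as 1-p when t is 1, and p when t is 0
        verdict = "right" if guess == actual else "wrong"
        print(f"{label}: p={p} guesses {guess}, actual {actual} ({verdict}), loss={loss:.1f}")
```

With 0.2 the third image is guessed as a 7 (wrong, loss 0.8); with 0.8 it is guessed as a 3 (right, loss 0.2).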

I hope that this makes sense?