How to interpret the accuracy when doing inference

Daguer · September 15, 2021, 1:14pm

I have an image classification model with an accuracy of 0.87.

When I run inference with it, I get a tensor for an image: (0.15, 0.85)

Is my probability of being right: model accuracy × tensor probability → 0.87 × 0.85 = 0.74?

Is my understanding correct?

Thanks

muellerzr · September 15, 2021, 2:02pm

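That tensor is saying the image is 15% class A and 85% class B

(so, the probabilities).

Accuracy during training simply means that, on your validation set, the model got 87% of the items correct. It describes the model over the whole set rather than any single prediction, so the two numbers aren't meant to be multiplied together.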
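To make the difference concrete, here is a minimal sketch in plain Python (no fastai or PyTorch needed) of how an accuracy figure is typically computed: take the argmax of each row of probabilities and count how often it matches the true label. The `val_probs` and `val_labels` values and the helper names are made up for illustration and are not from the thread.

```python
# Hypothetical validation predictions: one row of class probabilities per image.
val_probs = [
    [0.15, 0.85],
    [0.70, 0.30],
    [0.40, 0.60],
    [0.90, 0.10],
]
# Hypothetical true class indices for the same four images.
val_labels = [1, 0, 0, 0]


def argmax(row):
    """Return the index of the largest value in a row."""
    return max(range(len(row)), key=lambda i: row[i])


def accuracy(probs, labels):
    """Fraction of rows whose argmax equals the true label."""
    correct = sum(argmax(p) == y for p, y in zip(probs, labels))
    return correct / len(labels)


val_accuracy = accuracy(val_probs, val_labels)
print(val_accuracy)  # a single number describing the whole validation set
```

Accuracy only looks at whether the top class was right for each item, while the tensor returned at inference describes how the model spreads its confidence for one particular image.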

Daguer · September 15, 2021, 5:13pm

Do I read the tensor the same way for a multi-label problem?

class_A , class_B
value_1, value_2, value_3

I combined them to create my 6 label options:
class_A_value_1, class_A_value_2, class_A_value_3
class_B_value_1, class_B_value_2, class_B_value_3

Result: my tensor is (0.05, 0.10, 0.15, 0.1, 0.1, 0.5)

So there would be a 50% probability of it being class_B_value_3.

Let me know if I understand correctly.

thanks
David G.

muellerzr · September 15, 2021, 5:14pm

Correct. You can tell this is the case when all of the values sum to 1.
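As a rough sketch of that check in plain Python, using the six combined labels and the tensor values from the post above (the variable names are just for illustration):

```python
import math

# The six combined labels, in the order given in the post.
classes = ["class_A", "class_B"]
values = ["value_1", "value_2", "value_3"]
labels = [f"{c}_{v}" for c in classes for v in values]

# The tensor values from the post.
probs = [0.05, 0.10, 0.15, 0.1, 0.1, 0.5]

# The values behave like probabilities over the six labels if they sum to 1
# (compared with a tolerance because of floating-point rounding).
sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=1e-6)

# The prediction is the label at the index of the largest value.
best_index = max(range(len(probs)), key=lambda i: probs[i])
predicted = labels[best_index]

print(sums_to_one, predicted, probs[best_index])
```

If the values did not sum to 1, they could not be read as a single distribution over the six labels in this way.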

Daguer · September 15, 2021, 5:16pm

So what is the threshold for trusting your model?

muellerzr · September 15, 2021, 5:22pm

There isn’t one. Generally you take the argmax of these probabilities, and that is your prediction. This is how fastai does it. Argmax = the index of the highest value, and it is the same index whether you look at the probabilities or at the raw values that come before them, aka the logits.

So for example, my model doesn’t output things that sum to 1. What it really outputs is what we call logits, which are a bunch of positive or negative numbers. In this example my model predicts three classes.

x, _ = dls.train.one_batch()

x will be a batch of data. I will then use raw PyTorch to get the model logits:

import torch

with torch.no_grad():  # no gradients are needed for inference
    logits = learn.model(x)

These logits (on a batch of 1) may look like the following:

tensor([[0.4, 10.2, -20.]])

What we then do is perform softmax and argmax to translate this into something comparable.

Softmax:

softmax = logits.softmax(dim=-1)
tensor([[5.5449e-05, 9.9994e-01, 7.6609e-14]])

These all sum to 1, and can be called the “probabilities”.

The official “class” that we say the image most represents is found by taking the argmax of that tensor (the result is the same whether we take it on the logits or on our softmax’d probabilities):

logits.argmax(dim=-1)
tensor([1])

So if our dls.vocab is something like ['bird', 'snake', 'dog'], we want the name in position 1, so our model classified the input as a snake.
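For anyone who wants to follow the arithmetic without a trained model, the same steps can be sketched in plain Python. It uses the example logits and vocab from this post; the `softmax` and `argmax` helpers are written out by hand here as an illustration and are not fastai or PyTorch functions.

```python
import math


def softmax(logits):
    """Turn raw logits into values that are positive and sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


logits = [0.4, 10.2, -20.0]       # example logits for one image
vocab = ["bird", "snake", "dog"]  # example vocab from the post

probs = softmax(logits)

# Softmax preserves the ordering of the values, so the argmax of the
# logits and the argmax of the probabilities are the same index.
idx_from_logits = argmax(logits)
idx_from_probs = argmax(probs)

predicted = vocab[idx_from_probs]
print(probs, idx_from_logits, idx_from_probs, predicted)
```

The second probability comes out very close to 1, which matches the 9.9994e-01 shown in the tensor output above.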

Does this make more sense?

(This is also what predict and get_preds are doing under the hood)

Daguer · September 15, 2021, 6:32pm

thank you for your detailed answer.