Precision & Recall: understanding averages

I’d like to understand whether/how different averages for “precision” and “recall” are calculated.

I am classifying a single label with multiple classes. When I set averages to “'macro'”, scores seem to make sense:

epoch train_loss valid_loss accuracy precision recall fbeta time
19 0.655477 0.773050 0.761228 0.767700 0.672442 0.689554 03:15

However my classes are imbalanced, and as far as I understand I should use “'micro'” averages to account for this imbalance. If I switch to “'micro'” though:

learn.metrics=[accuracy,
               Precision(average='micro'),
               Recall(average='micro'),
               FBeta(average='micro')]

all 4 scores (including accuracy) are the same:

epoch train_loss valid_loss accuracy precision recall fbeta time
5 1.240163 0.991491 0.688006 0.688006 0.688006 0.688006 03:15

(same problem with “'weighted'” averages (recall always == accuracy))

Is that expected?


edit after some reading:

according to this it looks like precision, recall and f score are indeed the same for “'micro'” averages.

So the remaining question is: should all of these numbers also be the same as accuracy?
Also is “recall” really the same as accuracy in the case of “'weighted'” averages?


Let’s take a look.
Precision for a class is the fraction of correct classifications over all cases where you predicted that class:

Precision = #correct A's / #A predictions

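If you do a micro average, you sum the counts over all classes before dividing. Every sample gets exactly one prediction, so the denominator adds up to the number of samples: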

Precision = (#correctA + #correctB + ...)  / (#predA + #predB + ...)
          =           #correct             /        #samples          (Accuracy)

You can do the very same starting from Recall, because #A + #B + ... is also #samples.
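
To check the micro-average identity numerically, here is a minimal Python sketch. The labels are made-up toy data and the helper `micro_scores` is a hypothetical name, not part of fastai. It assumes single-label classification, where every sample gets exactly one prediction.

```python
from collections import Counter

def micro_scores(y_true, y_pred):
    """Return (accuracy, micro precision, micro recall) for single-label data."""
    # Per-class count of correct predictions (hypothetical helper, for illustration)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    predicted = Counter(y_pred)  # how often each class was predicted
    actual = Counter(y_true)     # how often each class really occurs
    total_correct = sum(correct.values())
    accuracy = total_correct / len(y_true)
    # Micro averaging: sum numerators and denominators over classes, then divide
    micro_precision = total_correct / sum(predicted.values())
    micro_recall = total_correct / sum(actual.values())
    return accuracy, micro_precision, micro_recall

# Toy, imbalanced labels (made up for illustration)
y_true = ["A", "A", "A", "A", "B", "B", "C", "C"]
y_pred = ["A", "A", "A", "B", "B", "C", "C", "A"]

acc, prec, rec = micro_scores(y_true, y_pred)
print(acc, prec, rec)
```

Both denominators are just the number of samples, so all three values come out as 5 / 8 = 0.625 on this toy data.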

If you do a macro average, you take the unweighted mean of the per-class values, and in general that will not be equal to accuracy:

Precision = (#correctA / #predA  +  #correctB / #predB  +  ...) / #classes

(this can’t be simplified any further)

Now for the weighted average of Recall: the weight of class A is #A / #samples, ..., and the Recall of class A is #correctA / #A, ...

Recall = (#A / #samples) * (#correctA / #A)  +  (#B / #samples) * (#correctB / #B)  +  ...
       = #correctA / #samples  +  #correctB / #samples  +  ...
       = (#correctA + #correctB + ...) / #samples     (Accuracy)

There we go already :smiley:
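
The macro and weighted cases can be checked in the same way. This is only a sketch on made-up toy labels, and `per_class_recall` is a hypothetical helper name, not a fastai function. It computes per-class recall, then averages it once without weights (macro) and once weighted by class frequency.

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Map each class to (recall, support), with recall = #correct / #actual."""
    actual = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: (correct[c] / n, n) for c, n in actual.items()}

# Toy, imbalanced labels (made up for illustration)
y_true = ["A", "A", "A", "A", "B", "B", "C", "C"]
y_pred = ["A", "A", "A", "B", "B", "C", "C", "A"]

stats = per_class_recall(y_true, y_pred)
n_samples = len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n_samples
# Macro: plain mean over classes, every class counts the same
macro_recall = sum(r for r, _ in stats.values()) / len(stats)
# Weighted: each class weighted by its share of the samples (#A / #samples)
weighted_recall = sum(r * n / n_samples for r, n in stats.values())
print(accuracy, macro_recall, weighted_recall)
```

On this toy data accuracy and weighted recall are both 0.625, while macro recall is about 0.583, because the two small classes pull the unweighted mean down.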


great, that’s helpful. thank you Dominik