Is there a method for calculating the best accuracy threshold for multi-label classification? (Lesson 3)

olaf.goerlitz · July 31, 2020, 10:31pm

It depends on what you mean by “best threshold”. Do you want fewer false negatives or fewer false positives? Actually, for multi-label classification it is not so clear what accuracy, precision, recall, or false negatives/positives mean, because predictions can be partially correct, e.g. if only one of two labels was predicted correctly.
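To make the trade-off concrete, here is a minimal sketch that counts label-level false positives and false negatives at a few thresholds. The function name `threshold_counts` and the toy probabilities and targets are made up for illustration; they are not from fastai or the lesson notebook.

```python
def threshold_counts(probs, targets, thresh):
    """Count label-level false positives and false negatives at one threshold."""
    fp = fn = 0
    for p_row, t_row in zip(probs, targets):
        for p, t in zip(p_row, t_row):
            pred = 1 if p > thresh else 0
            if pred == 1 and t == 0:
                fp += 1
            elif pred == 0 and t == 1:
                fn += 1
    return fp, fn

# Hypothetical predicted probabilities and multi-hot targets (3 samples, 3 labels).
probs = [[0.9, 0.4, 0.1],
         [0.3, 0.8, 0.6],
         [0.2, 0.1, 0.7]]
targets = [[1, 1, 0],
           [0, 1, 0],
           [0, 0, 1]]

for thresh in (0.2, 0.5, 0.8):
    fp, fn = threshold_counts(probs, targets, thresh)
    print(f"thresh={thresh}: false positives={fp}, false negatives={fn}")
```

On this toy data a low threshold gives only false positives and a high threshold gives only false negatives, so which threshold is “best” depends on which kind of error you care about more.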

While I was trying to figure out for myself how to calculate the accuracy correctly and which threshold to use, I got the impression that the accuracy_thresh implementation is not really correct for multi-label problems. In fact, the documentation says it is for one-hot encoded targets, but multi-label targets are not one-hot encoded (several labels can be active at the same time).

Basically, there are different ways to calculate accuracy for multi-label problems, e.g. the Exact Match Ratio (Subset Accuracy), or considering each predicted label individually. But none of this seems to be implemented in FastAI v1.
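As a rough sketch of the two variants mentioned above (plain Python, not fastai code; the helper names `binarize`, `exact_match_ratio` and `per_label_accuracy` and the toy data are my own assumptions):

```python
def binarize(probs, thresh=0.5):
    """Turn predicted probabilities into 0/1 label predictions."""
    return [[1 if p > thresh else 0 for p in row] for row in probs]

def exact_match_ratio(preds, targets):
    """Subset accuracy: a sample counts only if ALL of its labels are correct."""
    matches = sum(1 for p_row, t_row in zip(preds, targets) if p_row == t_row)
    return matches / len(targets)

def per_label_accuracy(preds, targets):
    """Label-wise accuracy: every (sample, label) pair is scored individually."""
    correct = sum(p == t
                  for p_row, t_row in zip(preds, targets)
                  for p, t in zip(p_row, t_row))
    total = sum(len(t_row) for t_row in targets)
    return correct / total

# Hypothetical predicted probabilities and multi-hot targets (3 samples, 3 labels).
example_probs = [[0.9, 0.4, 0.1],
                 [0.3, 0.8, 0.6],
                 [0.2, 0.1, 0.7]]
example_targets = [[1, 1, 0],
                   [0, 1, 0],
                   [0, 0, 1]]

example_preds = binarize(example_probs, thresh=0.5)
print("exact match ratio:", exact_match_ratio(example_preds, example_targets))
print("per-label accuracy:", per_label_accuracy(example_preds, example_targets))
```

Here only one of the three samples has every label right (exact match ratio 1/3), while 7 of the 9 individual labels are right, so the two numbers can differ a lot on the same predictions.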

There was actually a suggestion on how to implement a better accuracy calculation for multi-label in the thread “A different variant of accuracy_thresh”, but it seems a pull request was never made.

I also found some other threads with similar questions but none of them was satisfying for me.

Finally, it also seems that the notebook has an error. According to the post “No longer able to reproduce fastprogress's accuracy_thresh in v1.0.24”, accuracy_thresh should be called with the parameter sigmoid=False, but by default it is set to True.
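To illustrate why such a flag matters, here is a small sketch in plain Python (not fastai code). It assumes the values passed to the metric are already probabilities in [0, 1]; whether that is the case in the notebook is something you would need to check. Under that assumption, applying a sigmoid a second time pushes every value above 0.5:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical values that are ALREADY probabilities (assumption for this sketch).
already_probs = [0.05, 0.4, 0.9]
thresh = 0.5

# Thresholding the probabilities directly.
direct = [p > thresh for p in already_probs]

# Thresholding after a second, unnecessary sigmoid.
double = [sigmoid(p) > thresh for p in already_probs]

print("direct:", direct)
print("double sigmoid:", double)
```

Since sigmoid(x) > 0.5 for any x > 0, every label ends up predicted as positive after the extra sigmoid, and the resulting accuracy no longer says much about the model.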