Is there a method for calculating the best accuracy threshold for multi-label classification? (Lesson 3)

In the multi-label classification example (Amazon Planet), Jeremy uses 0.2 as the accuracy threshold. Is there a way to find this value mathematically for different models, or is it just a matter of trying different values and picking the best one?

Hi @babak.jfard,

You can try out different values and pick the best one.

Even better, you can create a graph to visualise it (this snippet comes from the course-v4 notebook) so it’s easy to pick the threshold which gives the highest accuracy.

Just stick the following code into a new cell:

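# Predictions and targets for the validation set
preds, targs = learn.get_preds(DatasetType.Valid)

# Sweep 29 thresholds from 0.05 to 0.95 and compute the accuracy at each one.
# sigmoid=False because get_preds is expected to return probabilities already.
xs = torch.linspace(0.05,0.95,29)
accs = [accuracy_thresh(preds, targs, thresh=i, sigmoid=False) for i in xs]
plt.plot(xs, accs)  # plt is matplotlib.pyplot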
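
If you would rather not read the best value off the graph, you can also take the maximum of the sweep directly. Below is a minimal, framework-free sketch of that idea. The helper `accuracy_at` and the toy `probs`/`targets` are made up for illustration and only stand in for `accuracy_thresh` and the real validation predictions.

```python
def accuracy_at(probs, targets, thresh):
    """Fraction of individual label decisions (p > thresh) that match the targets."""
    correct = total = 0
    for p_row, t_row in zip(probs, targets):
        for p, t in zip(p_row, t_row):
            correct += int((p > thresh) == bool(t))
            total += 1
    return correct / total

# Toy validation data (hypothetical): 4 samples, 3 labels each.
probs = [[0.92, 0.32, 0.08],
         [0.18, 0.62, 0.04],
         [0.72, 0.22, 0.42],
         [0.12, 0.13, 0.82]]
targets = [[1, 1, 0],
           [0, 1, 0],
           [1, 1, 0],
           [0, 0, 1]]

# Sweep thresholds 0.05, 0.10, ..., 0.95 and keep the one with the highest accuracy.
thresholds = [round(0.05 * k, 2) for k in range(1, 20)]
accs = [accuracy_at(probs, targets, t) for t in thresholds]
best_acc = max(accs)
best_thresh = thresholds[accs.index(best_acc)]  # first threshold reaching the maximum
print(best_thresh, best_acc)
```

Keep in mind that a threshold tuned on the validation set is fitted to that set, so it is worth checking it on held-out data if you have any.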

Hope this helps.



Hey @babak.jfard,

it depends on what you mean by “best threshold”. Do you want fewer false negatives or fewer false positives? Actually, for multi-label classification it is not so clear what accuracy, precision, recall, or false negatives/positives mean, because a prediction can be partially correct, e.g. if only one of two labels was predicted correctly.
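
To make the trade-off concrete, here is a small sketch that counts every individual label decision as its own true/false positive/negative (micro-averaging, which is only one of several possible conventions) and reports precision and recall at two thresholds. The function name and the toy data are hypothetical and not taken from fastai.

```python
def micro_precision_recall(probs, targets, thresh):
    """Precision and recall over all (sample, label) decisions at one threshold."""
    tp = fp = fn = 0
    for p_row, t_row in zip(probs, targets):
        for p, t in zip(p_row, t_row):
            pred = p > thresh
            if pred and t:
                tp += 1          # label predicted and present
            elif pred and not t:
                fp += 1          # label predicted but absent
            elif t:
                fn += 1          # label present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy data (hypothetical): 4 samples, 3 labels each.
probs = [[0.92, 0.32, 0.08],
         [0.18, 0.62, 0.04],
         [0.72, 0.22, 0.42],
         [0.12, 0.13, 0.82]]
targets = [[1, 1, 0],
           [0, 1, 0],
           [1, 1, 0],
           [0, 0, 1]]

low = micro_precision_recall(probs, targets, 0.2)   # fewer false negatives
high = micro_precision_recall(probs, targets, 0.5)  # fewer false positives
print(low, high)
```

On this toy data the lower threshold misses no label but lets one wrong label through, while the higher threshold does the opposite, so which one is “best” depends on which kind of error hurts more.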

While trying to figure out how to calculate the accuracy correctly and which threshold to use, I got the impression that this accuracy_thresh implementation is not really correct for multi-label problems. In fact, the documentation says it is for one-hot encoded targets, but multi-label targets are not one-hot encoded, since several labels can be active at once.

Basically, there are different ways to calculate accuracy for multi-label problems, e.g. the Exact Match Ratio (Subset Accuracy), or considering each predicted label individually. But neither seems to be implemented in fastai v1.
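
As a rough illustration of how these two views differ, here is a plain-Python sketch. The function names and the toy predictions are made up for this example. The exact match ratio only counts a sample as correct if its whole label vector matches, whereas the per-label variant scores each label decision separately.

```python
def exact_match_ratio(preds, targets):
    """Share of samples whose full label vector is predicted exactly right."""
    return sum(p == t for p, t in zip(preds, targets)) / len(targets)

def per_label_accuracy(preds, targets):
    """Share of individual (sample, label) decisions that are right."""
    pairs = [(p, t) for p_row, t_row in zip(preds, targets)
             for p, t in zip(p_row, t_row)]
    return sum(p == t for p, t in pairs) / len(pairs)

# Toy, already thresholded predictions (hypothetical): 4 samples, 3 labels each.
targets = [[1, 1, 0],
           [0, 1, 0],
           [1, 1, 0],
           [0, 0, 1]]
preds = [[1, 1, 0],
         [0, 1, 0],
         [1, 1, 1],   # one extra label, so the sample is only partially correct
         [0, 0, 1]]

emr = exact_match_ratio(preds, targets)
pla = per_label_accuracy(preds, targets)
print(emr, pla)
```

The partially correct third sample counts as a complete miss for the exact match ratio but as two hits out of three per label, which is why the two numbers can diverge quite a bit when there are many labels.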

There was actually a suggestion on how to implement a better accuracy calculation for multi-label in the thread “A different variant of accuracy_thresh”, but it seems no pull request was ever made.

I also found some other threads with similar questions, but none of them really satisfied me.

Finally, it also seems that the notebook has an error. According to the post “No longer able to reproduce fastprogress's accuracy_thresh in v1.0.24”, accuracy_thresh should be called with the parameter sigmoid=False, but it defaults to True.
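
To see why the flag matters, here is a small sketch of what happens if a sigmoid is applied to values that are already probabilities. Whether that is the case depends on where the predictions come from, so treat it as an assumption here. The numbers are made up and no fastai code is involved.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Assumed to be probabilities already (i.e. a sigmoid was applied once before).
probs = [0.02, 0.10, 0.50, 0.90, 0.99]

# Applying sigmoid a second time squashes everything into roughly [0.5, 0.73].
squashed = [sigmoid(p) for p in probs]

thresh = 0.2
pred_once = [p > thresh for p in probs]       # thresholding the probabilities
pred_twice = [s > thresh for s in squashed]   # thresholding the double sigmoid
print(pred_once, pred_twice)
```

Since the sigmoid of any value between 0 and 1 is at least 0.5, every label ends up above a threshold of 0.2 after the second sigmoid, so the reported accuracy would no longer say much about the model.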