In the multi-label classification example (Amazon Planet), Jeremy uses 0.2 as the accuracy threshold. Is there a method for finding this value mathematically for different models, or do you just try different values and pick the best one?
You can try out different values and pick the best one.
Even better, you can create a graph to visualise it (this snippet comes from the course-v4 notebook) so it’s easy to pick the threshold which gives the highest accuracy.
Just stick the following code into a new cell:
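# Predictions and targets for the validation set
preds, targs = learn.get_preds(DatasetType.Valid)
# Candidate thresholds from 0.05 to 0.95
xs = torch.linspace(0.05, 0.95, 29)
# sigmoid=False: this assumes get_preds already returned probabilities
accs = [accuracy_thresh(preds, targs, thresh=i, sigmoid=False) for i in xs]
plt.plot(xs, accs);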
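If you would rather get the best value as a number than read it off the plot, the same sweep can be sketched in plain Python. This is only an illustration: `accuracy_at` and `best_threshold` are made-up helper names, the data is a toy example, and "accuracy" here is assumed to mean the fraction of individual label decisions that match the targets.

```python
def accuracy_at(probs, targets, thresh):
    """Fraction of individual label decisions that match the targets."""
    correct = total = 0
    for p_row, t_row in zip(probs, targets):
        for p, t in zip(p_row, t_row):
            correct += int((p > thresh) == bool(t))
            total += 1
    return correct / total


def best_threshold(probs, targets, thresholds):
    """Return (threshold, accuracy) for the highest accuracy.

    max() keeps the first maximum, so with ascending thresholds
    a tie goes to the lowest threshold.
    """
    scored = [(accuracy_at(probs, targets, t), t) for t in thresholds]
    best_acc, best_t = max(scored, key=lambda s: s[0])
    return best_t, best_acc


# Toy validation data (invented for illustration): 3 samples, 3 labels.
probs = [[0.91, 0.33, 0.12],
         [0.22, 0.61, 0.42],
         [0.71, 0.08, 0.83]]
targets = [[1, 1, 0],
           [0, 1, 1],
           [1, 0, 1]]
thresholds = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95

print(best_threshold(probs, targets, thresholds))
```

On real data the curve is usually fairly flat near its peak, so it is worth looking at the neighbouring thresholds too rather than trusting a single maximum.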
Hope this helps.
It depends on what you mean by “best threshold”. Do you want fewer false negatives or fewer false positives? Actually, for multi-label classification it is not so clear what accuracy, precision, recall, or false negatives/positives mean, because you can have partially correct predictions, e.g. if only one of two labels was predicted correctly.
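To make the trade-off concrete, here is a small sketch that counts true positives, false positives and false negatives over all individual labels (micro-averaging, which is just one possible convention) and reports precision and recall at a few thresholds. The function name `micro_precision_recall` and the data are invented for illustration.

```python
def micro_precision_recall(probs, targets, thresh):
    """Precision and recall counted over every (sample, label) pair."""
    tp = fp = fn = 0
    for p_row, t_row in zip(probs, targets):
        for p, t in zip(p_row, t_row):
            pred = p > thresh
            if pred and t:
                tp += 1          # label predicted and present
            elif pred and not t:
                fp += 1          # label predicted but absent
            elif not pred and t:
                fn += 1          # label missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (invented): 2 samples, 3 labels.
probs = [[0.8, 0.4, 0.3],
         [0.6, 0.1, 0.7]]
targets = [[1, 1, 0],
           [0, 0, 1]]

for thresh in (0.2, 0.5, 0.65):
    precision, recall = micro_precision_recall(probs, targets, thresh)
    print(f"thresh={thresh}: precision={precision:.2f}, recall={recall:.2f}")
```

In this toy example the low threshold finds every true label but lets false positives through, and the high threshold does the opposite, so the "best" threshold depends on which error costs you more.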
While I was trying to figure out for myself how to calculate the accuracy correctly and which threshold to use, I got the impression that this accuracy_thresh implementation is not really correct for multi-label problems. In fact, the documentation says it is for one-hot encoded targets, but multi-label targets are not one-hot encoded.
Basically you can use different ways to calculate the accuracy for multi-label, e.g. with the Exact Match Ratio (Subset Accuracy) or considering each predicted label individually. But none of this seems to be implemented in FastAI v1.
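The difference between the two approaches is easy to see in a small sketch. The helper names `exact_match_ratio` and `per_label_accuracy` are my own and the data is invented; both functions take predictions that have already been thresholded to 0/1.

```python
def exact_match_ratio(preds, targets):
    """Subset accuracy: a sample counts only if every label is right."""
    matches = sum(1 for p, t in zip(preds, targets) if list(p) == list(t))
    return matches / len(targets)


def per_label_accuracy(preds, targets):
    """Each (sample, label) decision is scored on its own."""
    correct = sum(int(p == t)
                  for p_row, t_row in zip(preds, targets)
                  for p, t in zip(p_row, t_row))
    total = sum(len(row) for row in targets)
    return correct / total


# Toy data (invented): 4 samples, 3 labels, already thresholded.
preds = [[1, 0, 1],
         [0, 1, 0],
         [1, 1, 0],
         [0, 0, 1]]
targets = [[1, 0, 1],
           [0, 1, 1],
           [1, 1, 0],
           [1, 0, 1]]

print(exact_match_ratio(preds, targets))
print(per_label_accuracy(preds, targets))
```

Two of the four samples have a single wrong label, so the exact match ratio drops to one half while the per-label accuracy stays at 10 out of 12. The stricter metric will generally favour a different threshold than the per-label one.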
There was actually a suggestion on how to implement a better accuracy calculation for multi-label in the thread “A different variant of accuracy_thresh”, but it seems no pull request was ever made.
I also found some other threads with similar questions but none of them was satisfying for me.
- Why is the accuracy threshold so low in lesson 3 Planet kaggle exercise?
- Questions about accuracy threshold in lesson 3 planet exercise
Finally, it also seems that the notebook has an error. According to the post “No longer able to reproduce fastprogress's accuracy_thresh in v1.0.24”, accuracy_thresh should be called with the parameter sigmoid=False but by default it is set to