I’ve been building a tabular data model for a binary classification task and wanted to track recall as one of the metrics. However, after looking through the source code, I can’t work out how these metric classes set a probability threshold. They all inherit from the ConfusionMatrix() class, so I’m assuming I’m missing something here. I was hoping someone could provide a little insight into the inner workings of these classes.
Bit late to the party, but I’m having the exact same issue: I want to adjust the threshold for a binary classifier on tabular data, with precision as the metric.
Precision and all the metrics inheriting from ConfusionMatrix are for single-label classification problems, so they don’t take a threshold: the predicted class is simply the one with the highest probability (argmax).
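To make the difference concrete, here’s a small NumPy sketch (not fastai code, just an illustration) of what argmax prediction does versus what an explicit threshold would do. For two classes, argmax is equivalent to an implicit 0.5 cutoff on the positive-class probability:

```python
import numpy as np

# Two-class probabilities for 4 samples (columns: class 0, class 1)
probs = np.array([
    [0.9, 0.1],
    [0.7, 0.3],
    [0.6, 0.4],
    [0.2, 0.8],
])

# What ConfusionMatrix-style metrics do: take the argmax, which for the
# binary case is the same as a fixed 0.5 threshold on the positive class
preds_argmax = probs.argmax(axis=1)                # [0, 0, 0, 1]

# What a custom threshold would look like: predict class 1 whenever its
# probability exceeds 0.25 instead of 0.5
preds_thresh = (probs[:, 1] > 0.25).astype(int)    # [0, 1, 1, 1]
```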
Hmm, my data set is imbalanced (90% = 0, 10% = 1) to start with, so it’s predicting the negative class every time.
So accuracy is around 90% simply because the model always predicts the negative class, and precision is NaN because there are no positive predictions. That’s why I want to change the threshold from 50/50 to, say, 80/20.
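One way around the built-in metrics is to compute precision at a custom threshold yourself on the model’s positive-class probabilities. Here is a minimal sketch using scikit-learn; the probabilities and targets below are made up for illustration, and precision_at_threshold is a hypothetical helper, not a fastai function:

```python
import numpy as np
from sklearn.metrics import precision_score

def precision_at_threshold(probs_pos, targets, thresh=0.2):
    """Precision when predicting the positive class whenever its
    probability exceeds `thresh` (instead of the implicit 0.5)."""
    preds = (probs_pos > thresh).astype(int)
    return precision_score(targets, preds, zero_division=0)

# Hypothetical predicted positive-class probabilities and true labels
probs_pos = np.array([0.05, 0.30, 0.25, 0.60, 0.10])
targets   = np.array([0,    1,    0,    1,    0])

# At the default 0.5 cutoff only one sample is predicted positive;
# lowering the cutoff to 0.2 predicts three positives (two correct)
print(precision_at_threshold(probs_pos, targets, thresh=0.5))
print(precision_at_threshold(probs_pos, targets, thresh=0.2))
```

You’d get the probabilities from your learner’s predictions on the validation set and sweep the threshold to find the trade-off you want.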
Hey @nojeffrey, I managed to handle the imbalance by using FocalLoss() as the loss function. The new model significantly improved precision, recall, and F1 score on a comparable test set.
This might help: