I have a project where 99% recall is needed for the end result to be useful. Of course, for any mildly difficult classification task, demanding 99% recall means you're going to get a lot of false positives. The goal is to reduce the number of false positives as far as possible.
The way I've been operating so far: I train a model as normal, and then lower the confidence threshold for deciding whether an example is in the class from 0.5 until recall hits 0.99. This doesn't work terribly well, though, and there are still lots of false positives. I've also noticed that a model that does better on the test set at the 0.5 threshold can actually do worse once the threshold is lowered to reach 0.99 recall. I think this is because the model becomes more confident in its classifications, including its incorrect ones, the longer I train it. A highly confident incorrect classification hurts badly when you crank the threshold to reach 0.99 recall, even though accuracy at the 0.5 threshold is better.
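For concreteness, here's a minimal numpy sketch of the threshold-tuning procedure I described (the function name and synthetic data are just for illustration): pick the highest threshold whose recall on a validation set still meets the target, so precision is as high as the recall constraint allows.

```python
import numpy as np

def threshold_for_recall(y_true, scores, target_recall=0.99):
    """Return the highest score threshold whose recall >= target_recall.

    y_true: binary labels (1 = positive class); scores: model confidences.
    A higher threshold means fewer predicted positives, so recall grows
    monotonically as we lower it; we find the first cut that reaches the target.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    order = np.argsort(scores)[::-1]           # sort by confidence, descending
    sorted_labels = y_true[order]
    tp_cumulative = np.cumsum(sorted_labels)   # true positives kept at each cut
    recall = tp_cumulative / sorted_labels.sum()
    # first index where the recall target is met; everything scored at or
    # above this value gets predicted positive
    k = np.searchsorted(recall, target_recall)
    return scores[order][k]

# Illustrative synthetic data: positives score higher on average, with overlap.
rng = np.random.default_rng(0)
n = 5000
y = rng.integers(0, 2, n)
scores = np.clip(0.5 * y + rng.normal(0.25, 0.15, n), 0.0, 1.0)

t = threshold_for_recall(y, scores, target_recall=0.99)
pred = scores >= t
achieved_recall = (pred & (y == 1)).sum() / (y == 1).sum()
precision = (pred & (y == 1)).sum() / pred.sum()
print(f"threshold={t:.3f}  recall={achieved_recall:.3f}  precision={precision:.3f}")
```

On data like this, the threshold ends up far below 0.5 and precision drops accordingly, which is exactly the false-positive problem I'm running into.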
Are there best practices for dealing with this type of situation? Loss functions or types of models that work well here?