Hello, first post here, let’s see how this goes.
So I’m having trouble understanding a few things about the accuracy threshold. First, Jeremy says that metrics are not part of your model and that they don’t affect predictions. Now, if you set your threshold to 0.2, that means that if the final probability for a particular class is higher than 0.2, the model will classify it as having that class, right? Well, doesn’t that affect your predictions then? Shouldn’t it be a parameter as well in that case? If so, how do you choose it?
What am I missing here?
Any help would be appreciated.
Not totally sure I understand your question but I’ll give it a shot.
The metrics are just things like accuracy or F-beta scores. Basically, they are a quick way to see how your model is performing, judged by a particular criterion.
You are correct about the way the threshold works. If the model finds an object with 55% certainty, it will classify the image as having the object, since the default threshold is 50%. So changing this threshold doesn’t change the probabilities the model outputs; it only changes how readily a class gets reported. For example, in one of my projects, my model usually gives about a 99% chance for a class if it exists. Often, though, it will show an 80% chance for a class that doesn’t exist. To fix this, I changed the threshold to 90% in order to discard these false positives while maintaining accuracy for the correctly labeled classes.
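To make that concrete, here is a minimal sketch in plain Python. The class names and probabilities are made up for illustration, and `predict_labels` is a hypothetical helper, not a library function. The point is that the probabilities stay the same and only the decision made from them changes:

```python
# Hypothetical per-class probabilities for one image (made-up numbers).
probs = {"cat": 0.99, "dog": 0.80, "bird": 0.10}

def predict_labels(probs, thresh=0.5):
    """Return the classes whose probability is strictly above thresh."""
    return sorted(name for name, p in probs.items() if p > thresh)

# Same model output, two different thresholds.
print(predict_labels(probs, thresh=0.5))  # ['cat', 'dog']
print(predict_labels(probs, thresh=0.9))  # ['cat']
```

With the threshold at 0.9, the 80% class is dropped while the 99% class is kept.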
But if it changes the likelihood of classes being identified, it is part of your model, right? In that case, could we parametrize the threshold and incorporate it into our model to improve accuracy? Why do we set it manually?
That is a fair point and I did not think of it in that way.
But I think the reason we set it manually is that the right value depends on what you need. Say, for example, I was doing medical research and was trying to find cancer cells in images. In a case like this, I would rather have false positives than false negatives - we don’t want to tell people who have cancer that they don’t, and if we tell people who don’t have cancer that they might, a quick checkup would solve the problem. Something like this would need a lower threshold.
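Here is a rough sketch of that trade-off in plain Python. The data is entirely made up, and `count_errors` is a hypothetical helper I wrote for this post, so treat it as an illustration rather than a real evaluation:

```python
# Hypothetical (predicted probability of cancer, true label) pairs; 1 = cancer.
samples = [(0.95, 1), (0.60, 1), (0.30, 1), (0.40, 0), (0.10, 0), (0.05, 0)]

def count_errors(samples, thresh):
    """Return (false_positives, false_negatives) at the given threshold."""
    fp = sum(1 for p, y in samples if p > thresh and y == 0)
    fn = sum(1 for p, y in samples if p <= thresh and y == 1)
    return fp, fn

for thresh in (0.5, 0.2):
    fp, fn = count_errors(samples, thresh)
    print(f"thresh={thresh}: false positives={fp}, false negatives={fn}")
# thresh=0.5: false positives=0, false negatives=1
# thresh=0.2: false positives=1, false negatives=0
```

Lowering the threshold swaps the missed cancer case for one false alarm, which is the trade I would want here. Which direction is better depends on the application, not on the model weights.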