Why is the accuracy threshold so low in lesson 3 Planet kaggle exercise?

In the planet exercise https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson3-planet.ipynb, accuracy_thresh and the fbeta threshold are set to 0.2. I thought this would be bad, since we're saying that if any of the 17 output nodes has a probability greater than 0.2, we count that class as present and include it in the predicted set of labels.

Doesn't that leave room for a lot of uncertainty? If we instead trained the network to only count a class when it is more than 90% (0.9) certain, wouldn't that be better? Then uncertain predictions could sit well below the cutoff, around 0.2, rather than hovering right at the boundary where 0.21 counts and 0.19 doesn't.

Maybe I am misunderstanding the use of the accuracy threshold and it's simply for metrics, since Jeremy did mention that metrics don't play a part in training or in the actual predictions.

Can someone confirm that this is the case? I believe so, because running the model with thresh=0.2 and thresh=0.7 gave very similar results for both train_loss and valid_loss, but I just want to confirm. Also, how do we know whether we are doing well under Kaggle's scoring (F2) if we are setting our own thresholds for accuracy and fbeta? How can we tell that our metric matches how Kaggle scores it?
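To make the "metrics don't affect training" point concrete, here is a minimal NumPy sketch (my own, not fastai's or Kaggle's exact implementation) of a per-sample F2 score at a given threshold. Moving the threshold changes the reported score, but nothing about the loss being optimized:

```python
import numpy as np

def f2_at_threshold(probs, targets, thresh):
    """Binarize probabilities at `thresh`, then compute the F-beta score
    with beta=2 (recall weighted 4x versus precision), averaged over
    samples -- the flavour of F2 the Planet competition uses."""
    preds = (probs > thresh).astype(float)
    tp = (preds * targets).sum(axis=1)
    precision = tp / np.clip(preds.sum(axis=1), 1e-9, None)
    recall = tp / np.clip(targets.sum(axis=1), 1e-9, None)
    beta2 = 4.0  # beta = 2, so beta**2 = 4
    f2 = (1 + beta2) * precision * recall / np.clip(beta2 * precision + recall, 1e-9, None)
    return f2.mean()

# Two images, three tags; rows are predicted probabilities / true multi-hot labels
probs = np.array([[0.9, 0.3, 0.1],
                  [0.6, 0.8, 0.2]])
targets = np.array([[1.0, 1.0, 0.0],
                    [1.0, 1.0, 0.0]])
print(f2_at_threshold(probs, targets, 0.2))  # 1.0   (the 0.3 prediction still counts)
print(f2_at_threshold(probs, targets, 0.7))  # ~0.56 (true tags below 0.7 get dropped)
```

Note how the lower threshold favors recall, which F2 rewards heavily; that is presumably part of why a value as low as 0.2 works well here.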


I may be wrong about this, but I think the fbeta metric is computed on the logits, meaning the outputs of the network before passing through the final activation function.

This then makes more sense, since sigmoid(0.2) ≈ 0.55. So with a threshold of 0.2, you get rid of any prediction where your model isn't more than about 55% confident.
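Just to check the arithmetic behind that number, in plain Python:

```python
import math

def sigmoid(x):
    """Map a raw logit to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-x))

print(sigmoid(0.2))  # ~0.5498, i.e. a logit of 0.2 is roughly 55% confidence
```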

Also, from the planet module you can import the opt_th function, which computes the threshold that maximizes your accuracy.

For example:

preds, y = learn.TTA()   # test-time-augmented predictions and targets
acc = f2(preds, y)       # F2 score at the default threshold
th = opt_th(preds, y)    # threshold that maximizes accuracy

This may return a threshold value slightly different from 0.2.
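For anyone curious what such a function might do under the hood, here is a hypothetical sketch (the real opt_th lives in the course notebook; the name opt_th_sketch, the sweep range, and the plain per-element accuracy criterion are my own assumptions):

```python
import numpy as np

def opt_th_sketch(preds, targets, start=0.1, end=0.9, step=0.01):
    # Hypothetical re-creation, NOT the course's actual opt_th:
    # sweep candidate thresholds and keep the one that maximizes
    # per-element accuracy on the validation predictions.
    best_th, best_acc = start, -1.0
    for th in np.arange(start, end, step):
        acc = ((preds > th).astype(float) == targets).mean()
        if acc > best_acc:
            best_th, best_acc = th, acc
    return best_th

preds = np.array([[0.15, 0.9],
                  [0.25, 0.8]])
targets = np.array([[0.0, 1.0],
                    [0.0, 1.0]])
print(opt_th_sketch(preds, targets))  # some threshold between 0.25 and 0.8
```

A grid search like this is cheap because it only re-thresholds cached predictions; it never touches the model.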


Great to know about the logits piece