Hey guys,
I am currently making a text classifier and I am trying to figure out how to design it so that it doesn’t make wild guesses and returns blank if it is too unsure.
Here is my lang_learner
What params can I add to make it return blank if it is not at all sure what the category is?
Sorry if this is an amateur question. I am just trying to build a prototype application, and my objective is primarily to build an MVP, not to become an expert in ML.
fastai's learners generally have two methods to predict, learn.predict() and learn.get_preds(), but both return probabilities, for example with:
results = learn.get_preds(dl=learn.dls.valid).
results[0] is a (number of predicted instances) × (number of classes) matrix in which each row holds the probabilities of all the classes for one instance. For example, if probs = results[0], then probs[123, 6] holds the probability that instance 123 has label 6. Usually the label with the highest value is chosen as the prediction.
You could use probs.max(axis=1), which returns not only the predicted labels (as .argmax would) but also the probabilities with which those labels were predicted. You can then find the instances whose top probability falls below a threshold you pick and set their labels to “not good enough” or whatever, maybe -1.
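Here is a minimal sketch of that thresholding idea. It uses plain Python lists so it runs without fastai. The function name predict_or_abstain, the UNSURE sentinel, and the 0.6 threshold are made up for illustration and are not part of fastai, so pick a threshold that suits your data.

```python
UNSURE = -1  # hypothetical sentinel meaning "not confident enough"


def predict_or_abstain(probs, threshold=0.6):
    """Return the most likely label per row, or UNSURE if its probability is below threshold."""
    labels = []
    for row in probs:
        # Index of the largest probability in this row (the usual argmax prediction).
        best = max(range(len(row)), key=row.__getitem__)
        # Keep the label only if the model is confident enough about it.
        labels.append(best if row[best] >= threshold else UNSURE)
    return labels


# Made-up probabilities for three instances and three classes.
probs = [
    [0.90, 0.05, 0.05],  # confident: class 0
    [0.40, 0.35, 0.25],  # too unsure: abstain
    [0.10, 0.20, 0.70],  # confident: class 2
]
print(predict_or_abstain(probs, threshold=0.6))  # prints [0, -1, 2]
```

With real predictions you could pass results[0].tolist() as probs, and then show a blank in your app wherever the label comes back as -1.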
Make sure to keep asking if something is unclear or if I missed the point.