Precision and recall metrics with ULMFiT (fastai v0.7)

I’m trying to use the ULMFiT pipeline on highly imbalanced data, so the accuracy metric is not relevant. When I fit the classification model with accuracy as the metrics parameter, training works fine, but if I ask for precision, recall, or both, I get the following error:

RuntimeError: inconsistent tensor size, expected r_ [48], ta [48] and tb [48 x 2] to have the same number of elements, but got 48, 48 and 96 elements respectively at /opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/TH/generic/THTensorMath.c:3051

I’m struggling to understand why this happens and haven’t yet found the cause. How can a change of metric produce such an error, and why?

If you are doing single-label classification, i.e. classifying each example into exactly one of N classes, precision and recall are not appropriate metrics, as they depend on setting a threshold for each class. The accuracy you get is, in some sense, the average recall over all the classes. For such situations, the Kappa score is a more suitable measure of how your model is performing in spite of class imbalance.
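Something like this might work as a drop-in metric. This is just a minimal sketch: the kappa wrapper name, the use of scikit-learn’s cohen_kappa_score, and the learn.metrics usage line are my assumptions, not part of fastai v0.7:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

def kappa(preds, targs):
    # preds: (n, n_classes) scores/log-probabilities from the classifier head
    # targs: (n,) integer class labels
    pred_labels = np.argmax(preds, axis=1)  # hard single-label decisions
    return cohen_kappa_score(targs, pred_labels)

# Hypothetical usage with the old-style metric hook:
# learn.metrics = [kappa]
```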
One more thing: you can also try multi-label classification, in which each example can be predicted as one or more classes. There you can set a threshold for each class and calculate precision and recall. For this, you just need to pass a list of categories as the target variable for each example; fastai takes care of everything for you, and in this case you can calculate the F-beta score by setting the required threshold. You can also get precision and recall by tweaking the fbeta function here, as sketched below.
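For example, here is a minimal sketch of threshold-based precision and recall for the multi-label case. The function names and the exact tensor operations are my own, written in the spirit of the fbeta-style metrics rather than copied from fastai’s code:

```python
import torch

def precision_multi(preds, targs, thresh=0.5):
    # preds: (n, n_classes) per-class probabilities
    # targs: (n, n_classes) 0/1 multi-label targets
    pred_pos = (preds > thresh).float()      # hard per-class decisions
    tp = (pred_pos * targs.float()).sum()    # true positives
    return tp / pred_pos.sum().clamp(min=1)  # guard against division by zero

def recall_multi(preds, targs, thresh=0.5):
    pred_pos = (preds > thresh).float()
    tp = (pred_pos * targs.float()).sum()
    return tp / targs.float().sum().clamp(min=1)
```

Note that this kind of metric only makes sense when preds and targs have the same shape, which is presumably also why your single-label setup blows up: the error message shows a [48 x 2] prediction tensor being compared elementwise against [48] targets.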
For more details, refer here.
