Discrepancy with proba-based metrics between fastai2 and sklearn

oguiza · May 5, 2020, 9:22pm

Ok, I’ve just finished testing and found a few (easy to fix) issues.

Binary: APScoreBinary and RocAucBinary both work as expected.
Multi-class: RocAuc works well too. But:
* labels=None as a kwarg is missing
* there’s a typo in the description:
It says: "Area Under the Receiver Operating Characteristic Curve for single-label multi-label classification problems"
when it should be: "Area Under the Receiver Operating Characteristic Curve for single-label multi-class classification problems"
Multi-label: APScoreMulti and RocAucMulti are not working well because a thresh=0.5 has been added. But these are proba-based metrics: they score the predicted probabilities directly and don’t need a thresh.

I’ve removed thresh and now they work well.
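
To show why the thresh matters, here is a minimal sketch that uses sklearn only (no fastai). The toy targets and probabilities are made up for illustration. It compares roc_auc_score on the raw probabilities with the score you get after binarizing them at 0.5, which is roughly what applying a thresh before the metric would do.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Made-up multi-label data: 6 samples, 2 labels (illustration only).
y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1]])
y_prob = np.array([[0.90, 0.2], [0.60, 0.8], [0.55, 0.7],
                   [0.70, 0.1], [0.40, 0.3], [0.30, 0.6]])

# Proba-based metric fed with the probabilities, as intended.
auc_from_probs = roc_auc_score(y_true, y_prob, average='macro')

# Same metric fed with hard 0/1 predictions obtained from a 0.5 threshold.
y_hard = (y_prob >= 0.5).astype(int)
auc_from_thresholded = roc_auc_score(y_true, y_hard, average='macro')

print(auc_from_probs, auc_from_thresholded)
```

The two values differ because thresholding collapses every score to 0 or 1, so the ranking information that ROC AUC (and average precision) relies on is lost.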

So they should be:

def RocAuc(axis=-1, average='macro', sample_weight=None, max_fpr=None, multi_class='ovr', labels=None):
    "Area Under the Receiver Operating Characteristic Curve for single-label multi-class classification problems"
    assert multi_class in ['ovr', 'ovo']
    return skm_to_fastai(skm.roc_auc_score, axis=axis, activation=ActivationType.Softmax, flatten=False, average=average, sample_weight=sample_weight, max_fpr=max_fpr, multi_class=multi_class, labels=labels)


def APScoreMulti(sigmoid=True, average='macro', pos_label=1, sample_weight=None):
    "Average Precision for multi-label classification problems"
    activation = ActivationType.Sigmoid if sigmoid else ActivationType.No
    return skm_to_fastai(skm.average_precision_score, activation=activation, flatten=False,
                         average=average, pos_label=pos_label, sample_weight=sample_weight)


def RocAucMulti(sigmoid=True, average='macro', sample_weight=None, max_fpr=None):
    "Area Under the Receiver Operating Characteristic Curve for multi-label binary classification problems"
    activation = ActivationType.Sigmoid if sigmoid else ActivationType.No
    return skm_to_fastai(skm.roc_auc_score, activation=activation, flatten=False,
                         average=average, sample_weight=sample_weight, max_fpr=max_fpr)
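
For the labels kwarg added to RocAuc above, this is a small sklearn-only sketch of what ends up being passed through to skm.roc_auc_score. The logits and targets are invented for illustration, and the softmax is written by hand with NumPy to stand in for ActivationType.Softmax. In the multi-class case sklearn expects each row of scores to sum to 1, and labels, when given, must be unique and sorted.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Made-up single-label, 3-class example (illustration only).
logits = np.array([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [0.1, 0.4, 2.2],
                   [1.0, 0.9, 0.2], [0.3, 0.2, 1.1], [0.5, 1.8, 0.4]])
y_true = np.array([0, 1, 2, 0, 2, 1])

# Hand-written softmax so that every row of scores sums to 1.
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)

# labels=None: the classes are inferred from y_true.
auc_default = roc_auc_score(y_true, probs, multi_class='ovr', average='macro')

# labels given explicitly: one entry per column of probs, in sorted order.
auc_labels = roc_auc_score(y_true, probs, multi_class='ovr', average='macro',
                           labels=[0, 1, 2])

print(auc_default, auc_labels)
```

When every class appears in y_true, both calls return the same value; the explicit labels only state which class each column of scores belongs to.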