Discrepancy with proba-based metrics between fastai2 and sklearn

I’ve made a tentative update. Let me know if you run into any problems with it.

Great!
I’ll test it right away, and will get back to you.

OK, I’ve just finished testing and have found a few (easy to solve) issues.

  • Binary: APScoreBinary and RocAucBinary both work as expected.

  • Multi-class: RocAuc works well too. But:
    * labels=None is missing as a kwarg
    * there’s a typo in the description:
      It says: "Area Under the Receiver Operating Characteristic Curve for single-label multi-label classification problems"
      when it should be: "Area Under the Receiver Operating Characteristic Curve for single-label multi-class classification problems"

  • Multi-label is not working well because thresh=0.5 has been added. These are proba-based metrics, so they are computed on the predicted probabilities and don’t require a thresh.

I’ve removed thresh and now they work well.
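
To see why a threshold hurts a proba-based metric, here is a minimal pure-Python sketch. It assumes ROC AUC can be read as the fraction of (positive, negative) pairs that the scores rank correctly, with ties counted as half. The helper `pairwise_auc` and the toy data are hypothetical, written only for illustration; this is not the fastai or sklearn implementation.

```python
def pairwise_auc(y_true, scores):
    """Hand-rolled AUC: share of (positive, negative) pairs ranked correctly (ties = 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy binary targets and predicted probabilities (made up for this example)
y_true = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.45, 0.8]

# AUC computed on the raw probabilities
auc_from_probs = pairwise_auc(y_true, probs)

# AUC computed after applying thresh=0.5, which collapses the scores to 0/1
hard = [1 if p >= 0.5 else 0 for p in probs]
auc_from_hard = pairwise_auc(y_true, hard)

print(auc_from_probs, auc_from_hard)
```

In this toy case the probabilities rank every positive above every negative, but after thresholding the sample scored 0.45 ties with the negatives, so the two values differ. That loss of ranking information is the reason the metric should receive probabilities rather than thresholded predictions.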

So they should be:

def RocAuc(axis=-1, average='macro', sample_weight=None, max_fpr=None, multi_class='ovr', labels=None):
    "Area Under the Receiver Operating Characteristic Curve for single-label multi-class classification problems"
    assert multi_class in ['ovr', 'ovo']
    return skm_to_fastai(skm.roc_auc_score, axis=axis, activation=ActivationType.Softmax, flatten=False, average=average, sample_weight=sample_weight, max_fpr=max_fpr, multi_class=multi_class, labels=labels)


def APScoreMulti(sigmoid=True, average='macro', pos_label=1, sample_weight=None):
    "Average Precision for multi-label classification problems"
    activation = ActivationType.Sigmoid if sigmoid else ActivationType.No
    return skm_to_fastai(skm.average_precision_score, activation=activation, flatten=False,
                         average=average, pos_label=pos_label, sample_weight=sample_weight)


def RocAucMulti(sigmoid=True, average='macro', sample_weight=None, max_fpr=None):
    "Area Under the Receiver Operating Characteristic Curve for multi-label binary classification problems"
    activation = ActivationType.Sigmoid if sigmoid else ActivationType.No
    return skm_to_fastai(skm.roc_auc_score, activation=activation, flatten=False,
                         average=average, sample_weight=sample_weight, max_fpr=max_fpr)
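
For intuition on what multi_class='ovr' with average='macro' means in RocAuc, here is a rough pure-Python sketch. It assumes the one-vs-rest reading: each class in turn is treated as the positive class, a binary AUC is computed from that class's probability column, and the per-class values are averaged without weights. `pairwise_auc`, `ovr_macro_auc` and the data are hypothetical names and values for illustration, not sklearn code.

```python
def pairwise_auc(y_true, scores):
    """Hand-rolled binary AUC: share of (positive, negative) pairs ranked correctly (ties = 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ovr_macro_auc(y_true, probs):
    """One-vs-rest AUC per class, then an unweighted (macro) mean."""
    n_classes = len(probs[0])
    per_class = []
    for c in range(n_classes):
        binary_targets = [1 if y == c else 0 for y in y_true]  # class c vs the rest
        class_scores = [row[c] for row in probs]               # probability column for class c
        per_class.append(pairwise_auc(binary_targets, class_scores))
    return sum(per_class) / n_classes, per_class

# Toy single-label, 3-class example; each row is a softmax-style probability vector
y_true = [0, 1, 2, 2]
probs = [
    [0.70, 0.20, 0.10],
    [0.20, 0.50, 0.30],
    [0.10, 0.30, 0.60],
    [0.40, 0.35, 0.25],
]

macro_auc, per_class = ovr_macro_auc(y_true, probs)
print(per_class, macro_auc)
```

Each row needs to be a full probability vector over the classes, which is why RocAuc applies ActivationType.Softmax before calling the sklearn function.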

Thanks for investigating all of this. I removed the thresh and fixed the typo.

Great!
I’ve retested and everything works smoothly now :ok_hand:
So from my side we can close this.
THANKS a lot @FraPochetti and @sgugger for your work to fix this issue. It’s been a pleasure working with you!


Suppose you have two classes {0, 1} and you want to use RocAuc.
The two classes are complementary, like cats & dogs.

learn = cnn_learner(dls, resnet34, metrics=[accuracy])
learn.fine_tune(1)

What’s the best way to invoke it?
I don’t see any examples here:

https://dev.fast.ai/metrics#RocAuc

Hi Gerardo,
Sorry for the late reply, but I was out last week.

  1. You should select the appropriate metric:
    • RocAucBinary: for single-label binary
    • RocAuc: for single-label multi-class
    • RocAucMulti / APScoreMulti: for multi-label
  2. In your case (binary classification):
    learn = cnn_learner(dls, resnet34, metrics=[accuracy, RocAucBinary()])

@oguiza You are always super helpful :100:

What is the purpose of the axis?

RocAucBinary(axis=-1, average='macro', sample_weight=None, max_fpr=None, multi_class='raise')

Thanks @gerardo.
axis is just a value used by skm_to_fastai, but AFAIK it doesn’t need to be changed for any of the RocAuc variants.

I’m using RocAuc() for the US Income dataset, something like this:

to = TabularPandas(df, procs=[Categorify, FillMissing, Normalize],
                   cat_names=['workclass', 'education', 'marital.status', 'occupation', 'relationship', 'race', 'sex', 'native.country'],
                   cont_names=["fnlwgt", "capital.gain", "capital.loss", "hours.per.week", "age"],
                   y_names='income',
                   splits=splits)
dls = to.dataloaders(bs=64)
learn = tabular_learner(dls, metrics=[RocAuc()])

It’s giving me an error like this:

AttributeError                            Traceback (most recent call last)
<ipython-input-59-8fd4c17131d7> in <module>
----> 1 learn = tabular_learner(dls, metrics=[RocAuc()])

<ipython-input-36-803d668dbfd6> in RocAuc(axis, average, sample_weight, max_fpr, multi_class, labels)
     56     "Area Under the Receiver Operating Characteristic Curve for single-label multi-class classification problems"
     57     return skm_to_fastai(skm.roc_auc_score, axis=axis, flatten=False, softmax=True, proba=True,
---> 58                          average=average, sample_weight=sample_weight, max_fpr=max_fpr, multi_class=multi_class, labels=labels)
     59 
     60 def RocAucMulti(axis=-1, average='macro', sample_weight=None, max_fpr=None):

<ipython-input-36-803d668dbfd6> in skm_to_fastai(func, is_class, thresh, axis, sigmoid, softmax, proba, **kwargs)
     36     sigmoid = sigmoid if sigmoid is not None else (is_class and thresh is not None)
     37     return AccumMetric(func, dim_argmax=dim_argmax, sigmoid=sigmoid, softmax=softmax, proba=proba, thresh=thresh,
---> 38                        to_np=True, invert_arg=True, **kwargs)
     39 
     40 def APScore(axis=-1, average='macro', pos_label=1, sample_weight=None):

<ipython-input-36-803d668dbfd6> in __init__(self, func, dim_argmax, sigmoid, softmax, proba, thresh, to_np, invert_arg, flatten, **kwargs)
      3     def __init__(self, func, dim_argmax=None, sigmoid=False, softmax=False, proba=False, thresh=None, to_np=False, invert_arg=False,
      4                  flatten=True, **kwargs):
----> 5         store_attr(self,'func,dim_argmax,sigmoid,softmax,proba,thresh,flatten')
      6         self.to_np,self.invert_args,self.kwargs = to_np,invert_arg,kwargs
      7 

/opt/conda/lib/python3.7/site-packages/fastcore/basics.py in store_attr(names, self, but, cast, **attrs)
    275     if self: args = ('self', *args)
    276     else: self = fr.f_locals[args[0]]
--> 277     if not hasattr(self, '__stored_args__'): self.__stored_args__ = {}
    278     anno = annotations(self) if cast else {}
    279     if not attrs:

AttributeError: 'str' object has no attribute '__stored_args__'

Can someone please guide me on where I’m going wrong?

The answer is explained above.

  • RocAuc: for single-label multi-class
  • RocAucBinary or BinaryRocAuc / APScoreBinary or BinaryAPScore: for single-label binary
  • RocAucMulti / APScoreMulti: for multi-label

You may try the correct one, like RocAucMulti, and see the docs.
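
One more thing stands out in the traceback above: the pasted copy of AccumMetric calls store_attr(self, 'func,...'), while the installed fastcore shows the signature store_attr(names, self, but, cast, **attrs). With that argument order the string of names ends up in the `self` parameter, which matches the "'str' object has no attribute '__stored_args__'" message. Below is a minimal stand-alone sketch of that mismatch. `store_attr_standin` and `Holder` are hypothetical stand-ins that copy only the (names, self) parameter order from the traceback; they are not fastcore code.

```python
def store_attr_standin(names=None, self=None, **values):
    """Hypothetical stand-in: mimics only the (names, self) parameter order seen in the traceback."""
    if not hasattr(self, '__stored_args__'):
        self.__stored_args__ = {}          # fails if `self` is actually a str
    for name in names.split(','):
        setattr(self, name, values[name])
        self.__stored_args__[name] = values[name]

class Holder:
    """Plain object that can receive attributes."""

# Call matching the (names, self) order: works
new_style = Holder()
store_attr_standin('func,thresh', new_style, func=max, thresh=None)

# Call using the older (self, names) order: the string lands in `self`
old_style = Holder()
try:
    store_attr_standin(old_style, 'func,thresh', func=max, thresh=None)
    old_call_error = None
except AttributeError as err:
    old_call_error = err

print(new_style.func, repr(old_call_error))
```

If that reading is right, using the metric that ships with the installed fastai (rather than a pasted copy of older source) should avoid the mismatch, since the library and fastcore versions would then agree on the call.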


Thank you so much, dear oguiza.
When I use:
def RocAuc(axis=-1, average='macro', sample_weight=None, max_fpr=None, multi_class='ovr', labels=None):
    "Area Under the Receiver Operating Characteristic Curve for single-label multi-class classification problems"
    assert multi_class in ['ovr', 'ovo']
    return skm_to_fastai(skm.roc_auc_score, axis=axis, activation=ActivationType.Softmax, flatten=False, average=average, sample_weight=sample_weight, max_fpr=max_fpr, multi_class=multi_class, labels=labels)


I get the error "NameError: name 'ActivationType' is not defined".
What should I add to my code to fix it?
Thanks