Multiclass metrics that weight by class

Hello, I’m interested in defining a metric for a multi-class classification problem that takes into account that some classes may be more important than others to detect correctly. Does anyone know of any standard metrics that do this?

For example, suppose one has a multi-class problem with classes A, B, C. Suppose business requirements then dictate that detecting C correctly is 10x more important than detecting A or B. How should one set up a metric that can do this, irrespective of how much data one has for each class?

Ideally this could be done without having to completely define a metric from scratch on the confusion matrix, but I’ll take what I can get.
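For concreteness, here is a minimal sketch of what the from-scratch fallback might look like: an importance-weighted average of per-class recall. The function name `importance_weighted_recall` and the `importance` dictionary are hypothetical, not taken from any library, and the 1/1/10 weights are just the ones from the A, B, C example. Each class's recall is normalized by that class's own count, so the weights reflect only the stated importance and not how much data each class has.

```python
from collections import Counter

def importance_weighted_recall(y_true, y_pred, importance):
    """Average per-class recall, weighted by business importance.

    importance: dict mapping class label -> weight (hypothetical, user-chosen).
    """
    support = Counter(y_true)  # number of true examples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    score = 0.0
    total_weight = 0.0
    for label, weight in importance.items():
        if support[label] == 0:
            continue  # recall is undefined for classes absent from y_true
        score += weight * hits[label] / support[label]
        total_weight += weight
    return score / total_weight if total_weight else 0.0

# Toy data: A and B are always right, C is right half the time.
y_true = ["A", "A", "A", "A", "B", "B", "C", "C"]
y_pred = ["A", "A", "A", "A", "B", "B", "C", "A"]
importance = {"A": 1, "B": 1, "C": 10}  # C counts 10x as much as A or B

score = importance_weighted_recall(y_true, y_pred, importance)
print(round(score, 4))  # (1*1 + 1*1 + 10*0.5) / 12 = 0.5833
```

On this toy data plain accuracy is 7/8 = 0.875, while the weighted score drops to about 0.58 because half of the C examples are missed. Note that this only rewards recall; a precision-based or cost-matrix variant would be needed to also penalize false alarms on C.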

To be clear, I’m not talking about the micro/macro/weighted averaging options that you see in sklearn-style metrics. The metrics I’m thinking of here shouldn’t depend on how much data you have for each class, but instead on business requirements that make some classes more important than others.

I know that in the binary case one can use the F-beta score as a proxy for this kind of thing. The problem is that extending F-beta to the multi-class case generally doesn’t have quite the same effect, and it falls back on the same micro/macro/weighted averaging mentioned above.
