Custom metric showing wrong value during training

marcossantana · February 7, 2021, 4:32am

Hi, guys

I’m trying to implement a custom metric similar to this one. Basically, I want to compute the mean Matthews correlation coefficient (MCC) across multiple columns of the predictions. In total, I have 44 columns. The catch is that I want to ignore some values in these columns. For example, in the tensor below, I want to ignore every position where the value is -1.0:

targs = tensor([-1,-1,-1,-1,1,0,0,0,1,-1])

So far, this is what I wrote:

import numpy as np
from fastai.metrics import MatthewsCorrCoef

def MaskedMatthewCorref2(preds, targ, thresh:float=0.5, sigmoid=True):
    "Computes the mean per-column MCC between `preds` and `targ`, ignoring targets below 0"
    base = MatthewsCorrCoef()
    mccs = []
    if sigmoid: preds = preds.sigmoid()
    preds = (preds >= thresh).float()  # binarize once, outside the loop
    for i in range(targ.size(1)):
        mask = targ[:, i] >= 0.0  # ignore missing bioactivities (-1)
        masked_targ = targ[:, i][mask]
        masked_preds = preds[:, i][mask]
        # Skip columns that have no labelled rows in this batch
        if masked_targ.size(0) != 0:
            mccs.append(base(masked_preds, masked_targ))
    return np.nanmean(mccs)

The problem is that the MCC during training does not match the value calculated during testing. I think the AccumMetric class can help me here, but I’m finding it difficult to implement my metric correctly.
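
One plausible cause of the mismatch (an assumption, not something confirmed in this post) is that a plain function used as a metric is scored batch by batch and the batch scores are then averaged, while testing scores all predictions at once. MCC is not a per-sample mean, so the two numbers generally differ, and accumulating predictions before scoring is the gap AccumMetric is meant to close. The sketch below uses pure Python and made-up data for a single label column; `mcc`, `masked_mcc`, and the batches are hypothetical names for illustration, and returning 0.0 for an undefined score is a convention chosen here.

```python
import math

def mcc(preds, targs):
    """Matthews correlation coefficient for two equal-length 0/1 sequences."""
    tp = sum(1 for p, t in zip(preds, targs) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(preds, targs) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(preds, targs) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(preds, targs) if p == 0 and t == 1)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: report 0.0 when the denominator is zero (score undefined).
    return (tp * tn - fp * fn) / denom if denom else 0.0

def masked_mcc(preds, targs):
    """MCC over the positions whose target is not the -1 'missing' marker."""
    kept = [(p, t) for p, t in zip(preds, targs) if t >= 0]
    return mcc([p for p, _ in kept], [t for _, t in kept])

# One label column, split into two hypothetical batches (already thresholded).
batch_preds = [[1, 0, 1, 1], [0, 0, 1, 0]]
batch_targs = [[1, 0, -1, 0], [0, 1, 1, -1]]

# Training-style value: score each batch, then average the batch scores.
per_batch = [masked_mcc(p, t) for p, t in zip(batch_preds, batch_targs)]
batch_averaged = sum(per_batch) / len(per_batch)

# Test-style value: concatenate everything, then score once.
all_preds = [p for batch in batch_preds for p in batch]
all_targs = [t for batch in batch_targs for t in batch]
accumulated = masked_mcc(all_preds, all_targs)

print(batch_averaged, accumulated)
```

With this data each batch scores 0.5 on its own, but the concatenated data scores about 0.33, so a batch-averaged MCC can differ from the accumulated one even when the masking logic is right.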