Hi, guys
I'm trying to implement a custom metric similar to this one. Basically, I want to compute the mean Matthews correlation coefficient (MCC) across multiple columns of the predictions; in total, I have 44 columns. The catch is that I want to ignore some values in these columns. For example, in the tensor below, I want to ignore every index where the value is -1:
    targs = tensor([-1,-1,-1,-1,1,0,0,0,1,-1])
So far, this is what I wrote:
    import numpy as np
    from fastai.metrics import MatthewsCorrCoef

    def MaskedMatthewCorref2(preds, targ, thresh:float=0.5, sigmoid=True):
        "Computes the mean MCC between `preds` and `targ` over columns, ignoring negative targets"
        base = MatthewsCorrCoef()
        mccs = []
        if sigmoid: preds = preds.sigmoid()
        preds = (preds >= thresh).float()  # binarize once, outside the loop
        for i in range(targ.size(1)):
            mask = targ[:, i] >= 0.0  # ignore missing bioactivities in column i
            masked_targ = targ[:, i][mask]
            masked_preds = preds[:, i][mask]
            if masked_targ.size(0) != 0:
                mccs.append(base(masked_preds, masked_targ))
        return np.nanmean(mccs)
The problem is that the MCC during training does not match the value calculated during testing. I think the AccumMetric class can help me here, but I’m finding it difficult to implement my metric correctly.
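To make clear what I think is going wrong: a plain function metric is evaluated per batch and the batch values are then averaged, while MCC is not additive across batches, so that average can differ from the MCC computed once over the whole set. Below is a minimal NumPy-only sketch of the accumulate-then-compute idea. The class `MaskedMeanMCC` and the helper `binary_mcc` are hypothetical names of my own, not fastai API; the `reset` / `accumulate` / `value` layout only mimics the general shape of an accumulating metric. I also assume here that a column with an undefined MCC (zero denominator) should yield NaN and be skipped by `nanmean`.

```python
import numpy as np

def binary_mcc(y_true, y_pred):
    """MCC for 0/1 arrays; returns NaN when the denominator is zero (assumption)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = float(np.sum(y_true & y_pred))
    tn = float(np.sum(~y_true & ~y_pred))
    fp = float(np.sum(~y_true & y_pred))
    fn = float(np.sum(y_true & ~y_pred))
    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom > 0 else float("nan")

class MaskedMeanMCC:
    """Hypothetical accumulator: stores every batch, computes the metric once."""
    def __init__(self, thresh=0.5, sigmoid=True):
        self.thresh, self.sigmoid = thresh, sigmoid
        self.reset()

    def reset(self):
        self.preds, self.targs = [], []

    def accumulate(self, preds, targs):
        preds = np.asarray(preds, dtype=float)
        if self.sigmoid:
            preds = 1.0 / (1.0 + np.exp(-preds))  # logits -> probabilities
        self.preds.append(preds >= self.thresh)   # binarized predictions
        self.targs.append(np.asarray(targs, dtype=float))

    @property
    def value(self):
        preds = np.concatenate(self.preds)
        targs = np.concatenate(self.targs)
        mccs = []
        for i in range(targs.shape[1]):
            mask = targs[:, i] >= 0               # drop entries marked -1
            if mask.any():
                mccs.append(binary_mcc(targs[mask, i], preds[mask, i]))
        return float(np.nanmean(mccs)) if mccs else float("nan")

# Two batches with one column; the last row of the second batch is masked out.
metric = MaskedMeanMCC()
metric.accumulate([[2.0], [-2.0]], [[1], [0]])
metric.accumulate([[2.0], [-2.0], [2.0], [-2.0], [3.0]], [[0], [1], [1], [0], [-1]])
print(metric.value)  # MCC over all accumulated rows
```

In this toy example the per-batch MCCs are 1.0 and 0.0 (mean 0.5), whereas the MCC over all six unmasked rows is 1/3, which is the kind of mismatch I am seeing between training and testing.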