I’m writing a custom loss function for a multitask text classification problem. There are 44 classes, and each sample can be labeled “1” or “0” on each class. However, not every sample has a label for every class: sample1 might have a “1” on class1 and empty entries (which I filled with -1) on all the others. So my goal is to write a loss function that ignores the missing labels and computes the loss only over the 1’s and 0’s. Here’s what I have so far:
```python
def masked_BCE(inputs, targets):
    inputs = inputs.sigmoid()
    mask = targets >= 0.0        # keep only labeled entries; -1 means missing
    inputs = inputs[mask]
    targets = targets[mask]
    return -torch.where(targets == 1, inputs, 1 - inputs).log().mean()
```
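As a quick sanity check in plain PyTorch (no fastai involved), the function should agree with `binary_cross_entropy_with_logits` restricted to the labeled entries. The shapes and the -1 convention below are just stand-ins for my data:

```python
import torch
import torch.nn.functional as F

def masked_BCE(inputs, targets):
    inputs = inputs.sigmoid()
    mask = targets >= 0.0        # keep only labeled entries; -1 means missing
    inputs = inputs[mask]
    targets = targets[mask]
    return -torch.where(targets == 1, inputs, 1 - inputs).log().mean()

torch.manual_seed(0)
logits = torch.randn(4, 44)                      # fake model outputs, 44 classes
targets = torch.randint(-1, 2, (4, 44)).float()  # entries in {-1, 0, 1}

# Reference: standard BCE-with-logits computed on the labeled entries only
mask = targets >= 0.0
reference = F.binary_cross_entropy_with_logits(logits[mask], targets[mask])

print(torch.allclose(masked_BCE(logits, targets), reference, atol=1e-6))  # True
```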
This function masks out the missing entries (the -1 values) and calculates the loss only where the target vector contains a 1 or a 0. When I try it on the output of the model:
```python
output = learn.model(x)
masked_BCE(output, y)
```

```
TensorMultiCategory(0.7127, device='cuda:0', grad_fn=<AliasBackward>)
```
It runs just fine. However, when I try training the model I get this error:
```
TypeError: unsupported operand type(s) for +=: 'TensorMultiCategory' and 'TensorText'
```
Has anybody tried something similar? I can’t figure out why this error is happening.
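My current guess is that the error comes from mixing fastai’s tensor subclasses (`TensorMultiCategory` from the targets, `TensorText` from the model output), which apparently can’t be combined in some ops during training. If that’s right, one workaround is to strip the subclasses with `Tensor.as_subclass` before doing any arithmetic. This is a sketch based on that assumption, shown here with plain tensors:

```python
import torch

def masked_BCE(inputs, targets):
    # Cast fastai's tensor subclasses (TensorText, TensorMultiCategory)
    # back to plain torch.Tensor so their types can't clash.
    inputs = inputs.as_subclass(torch.Tensor)
    targets = targets.as_subclass(torch.Tensor)
    inputs = inputs.sigmoid()
    mask = targets >= 0.0
    inputs = inputs[mask]
    targets = targets[mask]
    return -torch.where(targets == 1, inputs, 1 - inputs).log().mean()

torch.manual_seed(0)
logits = torch.randn(2, 44, requires_grad=True)
targets = torch.randint(-1, 2, (2, 44)).float()
loss = masked_BCE(logits, targets)
loss.backward()   # gradients flow only through the masked (labeled) entries
```

I haven’t verified this against the exact fastai version that raises the error, so treat it as a guess.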
I modified the function to use `BCEWithLogitsLossFlat` (fastai’s flattened wrapper around PyTorch’s `BCEWithLogitsLoss`) and SURPRISE! The model is training now. But I have no idea why, since it’s basically the same thing:
```python
def masked_BCEWithLogits(inputs, targets):
    criterion = BCEWithLogitsLossFlat()
    mask = targets >= 0.0
    inputs = inputs[mask]
    targets = targets[mask]
    return criterion(inputs, targets)
```
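One side note in favor of the logits-based version regardless of the type error: computing `inputs.sigmoid().log()` manually can overflow to `inf` for large logits, while `BCEWithLogitsLoss` combines the sigmoid and the log internally with the log-sum-exp trick. A small pure-PyTorch demonstration (the value 60.0 is just an arbitrary large logit):

```python
import torch
import torch.nn.functional as F

big = torch.tensor([60.0])    # a large positive logit
target = torch.tensor([0.0])  # true label is 0

# Naive route: sigmoid saturates to exactly 1.0 in float32,
# so 1 - sigmoid == 0 and log(0) blows up to inf.
naive = -(1 - big.sigmoid()).log()

# Fused route: computed as softplus, stays finite (~60.0).
stable = F.binary_cross_entropy_with_logits(big, target)

print(naive.item(), stable.item())
```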