Use CategoryList for binary classification, but the sum of preds != 1?

Hi everybody, I’ve run into a problem with learn.get_preds: I can’t get the sum of the predicted probabilities to equal 1.

The situation is this: I wrote a custom focal loss function for a tabular learner:
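(The original screenshot is gone; below is a minimal sketch of a standard multi-class focal loss. The class name and the gamma/alpha defaults are illustrative, and only the internal F.log_softmax call is confirmed by the reply later in this thread.)

import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLoss(nn.Module):
    "Cross entropy scaled by (1 - p)**gamma to down-weight easy examples."
    def __init__(self, gamma=2., alpha=1.):
        super().__init__()
        self.gamma, self.alpha = gamma, alpha

    def forward(self, input, target):
        # input: raw final-layer activations (logits), shape [bs, n_classes]
        logp = F.log_softmax(input, dim=1)                     # log-probabilities
        logp = logp.gather(1, target.unsqueeze(1)).squeeze(1)  # log p of the true class
        p = logp.exp()
        return (-self.alpha * (1 - p) ** self.gamma * logp).mean()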

The DataBunch and the learner are set up like this:
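(A sketch of a typical fastai v1 tabular setup; df, path, cat_names, cont_names, dep_var, and the layer sizes are placeholders, not the original poster’s values.)

from fastai.tabular import *

procs = [FillMissing, Categorify, Normalize]
data = (TabularList.from_df(df, path=path, cat_names=cat_names,
                            cont_names=cont_names, procs=procs)
        .split_by_rand_pct(0.2)
        .label_from_df(cols=dep_var)       # the binary target becomes a CategoryList
        .databunch())

learn = tabular_learner(data, layers=[200, 100], metrics=accuracy)
learn.loss_func = FocalLoss()              # swap in the custom focal loss
learn.fit_one_cycle(5)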

The training results look fine:

but I can’t get the sum of the predicted probabilities to equal one!

The output shows this:

Can anyone help? I think there should be a sigmoid, like when I use BCE loss and it automatically normalizes the probability for me.

I have searched the forum for an answer, like this:

If you look at the definition of get_preds in the basic_train.py file: https://github.com/fastai/fastai/blob/master/fastai/basic_train.py#L333, you’ll see that the final activation function is chosen either from the function you pass as the activ parameter or from the helper function _loss_func2activ. The helper function doesn’t have a mapping from your focal loss to an activation function, so noop (the identity function) is returned.

activ = ifnone(activ, _loss_func2activ(self.loss_func))

def _loss_func2activ(loss_func):
    if getattr(loss_func,'keywords',None):
        if not loss_func.keywords.get('log_input', True): return
    axis = getattr(loss_func, 'axis', -1)
    # flattened loss
    loss_func = getattr(loss_func, 'func', loss_func)
    # could have a partial inside flattened loss! Duplicate on purpose.
    loss_func = getattr(loss_func, 'func', loss_func)
    cls_name = camel2snake(loss_func.__class__.__name__)
    if cls_name == 'mix_up_loss':
        loss_func = loss_func.crit
        cls_name = camel2snake(loss_func.__class__.__name__)
    if cls_name in loss_func_name2activ:
        if cls_name == 'poisson_nll_loss' and (not getattr(loss_func, 'log_input', True)): return
        return _loss_func_name2activ(cls_name, axis)
    if getattr(loss_func,'__name__','') in loss_func_name2activ:
        return _loss_func_name2activ(loss_func.__name__, axis)
    return noop

Therefore, when you call get_preds, you’re getting the raw activations from the final layer of your network.

To rectify this, you can either pass a softmax function as the activ parameter or simply apply softmax yourself to the result of get_preds.
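For example (variable names are illustrative; activ is the parameter mentioned above):

import torch.nn.functional as F
from functools import partial

# Option 1: let get_preds apply the activation for you
probs, targets = learn.get_preds(activ=partial(F.softmax, dim=-1))

# Option 2: apply softmax to the raw activations afterwards
raw_preds, targets = learn.get_preds()
probs = F.softmax(raw_preds, dim=-1)  # each row now sums to 1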

Thank you, sir!
As far as I can see, a PR has been posted on GitHub to deal with this situation. Thank you for your kindness~
PS: the PR’s motivation came from this URL.

Hi, thanks for your kind reply!

I’ve read your reply five times, and I have one final question:

If I have to apply a softmax function to the output of get_preds to get probabilities, then once the model is trained, are the reported validation focal loss and metric still correct, just as when I use the default loss and metric?

Because when I changed the loss from BCE to focal loss, the validation F1 score went from 0.4 to 0.6, which is a big improvement! …I somewhat doubt the result…

Waiting for your reply. Thank you again!

Hi,

You need to compare the definitions of the two losses to see what inputs they expect.

Your focal loss expects the activations from the last layer as inputs (before any softmax function). You can see this because F.log_softmax(input, dim=1) is called on the inputs in the forward function of your loss.

Fastai uses a flattened version of PyTorch’s CrossEntropyLoss by default - see the docs here: https://pytorch.org/docs/stable/nn.html#torch.nn.CrossEntropyLoss.

This criterion combines nn.LogSoftmax() and nn.NLLLoss() in one single class.

So as far as I understand, CrossEntropyLoss also expects activations from the final layer.
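You can check this equivalence yourself (a quick self-contained snippet, not from the original code):

import torch
import torch.nn.functional as F

logits = torch.randn(4, 2)    # raw final-layer activations: 4 samples, 2 classes
targets = torch.tensor([0, 1, 1, 0])

ce  = F.cross_entropy(logits, targets)
nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)
assert torch.allclose(ce, nll)  # CrossEntropyLoss == LogSoftmax + NLLLoss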

Good luck!