Custom loss for decision making

Hey guys,

I recently got back into learning neural networks.

I am trying to build a neural network that has to make a decision. At any given hour, I want to predict whether the blue number is going to be greater than the orange number.

If the network correctly predicts the sign of the difference, it earns a reward equal to the absolute difference. If it is wrong, it pays a penalty of the same size. For example, if orange = 5 and blue = 10 and you predict Positive, you get +\$5 as a reward, since blue - orange == 5. If you predicted Negative instead, you would get -\$5. On top of that, there’s a fixed cost for making a prediction: \$0.20 for a positive prediction and \$1.20 for a negative prediction.
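
To make the arithmetic concrete, here is a minimal sketch of the payoff rule described above. The function name `payoff` and the choice labels are hypothetical, and it assumes the fixed cost is simply subtracted from the reward or penalty.

```python
# Fixed cost per choice in dollars (values from the post).
FIXED_COST = {"nothing": 0.0, "negative": 1.2, "positive": 0.2}
# Direction of the bet: 0 = no bet, -1 = bet negative, +1 = bet positive.
SIGN = {"nothing": 0, "negative": -1, "positive": 1}

def payoff(choice: str, blue: float, orange: float) -> float:
    """Profit of one decision: signed difference minus the fixed cost."""
    diff = blue - orange
    return SIGN[choice] * diff - FIXED_COST[choice]

print(payoff("positive", blue=10, orange=5))  # reward of 5, minus the 0.2 cost
print(payoff("negative", blue=10, orange=5))  # penalty of 5, minus the 1.2 cost
print(payoff("nothing", blue=10, orange=5))   # no bet, no cost
```

Under this rule a bet only pays off when the difference exceeds the fixed cost, which is part of why "do nothing" can be the best choice even when the sign is predicted correctly.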

I would also like the network to be able to decide to do nothing if it is not certain enough. The cost of being wrong is pretty steep, and every prediction carries a fixed cost, so ideally it would not bet at all unless it is sure enough.

The network must choose between three choices:

• Do nothing
• Bet negative
• Bet positive

I want to make a custom loss function that takes this into account.

I have a tabular model with 3 outputs. I pass them through softmax, which gives me a tensor of the probabilities the model assigns to each choice…

Here is my first crack at it:

``````
import torch
from torch import nn

class ProfitLoss(nn.Module):
    def forward(self, out, targs):
        # out is a [batch, 3] tensor of logits: [:,0] == do nothing, [:,1] == bet negative, [:,2] == bet positive
        # softmax turns it into per-choice probabilities that sum to 1
        probs = torch.softmax(out, dim=1)

        # targs is blue - orange, the target in my dataloader; reshape to [batch, 1] so it broadcasts
        # multiplying by [0, -1, 1] turns a target of 500 into [0, -500, 500]
        signs = torch.tensor([0.0, -1.0, 1.0], device=out.device)
        payoff = targs.reshape(-1, 1).float() * signs

        # subtract the fixed cost per prediction: $0 for doing nothing, $1.20 for negative, $0.20 for positive
        # this happens before weighting by the probabilities; a constant added afterwards has no gradient
        payoff = payoff - torch.tensor([0.0, 1.2, 0.2], device=out.device)

        # weight by the probabilities and sum over the choices to get the expected profit per row,
        # e.g. [0.2, 0.6, 0.2] * [0, -501.2, 499.8] = [0, -300.72, 99.96], so I would lose about $200.76
        profit = (probs * payoff).sum(dim=1)

        # negate because this is a gain and the optimizer minimizes the loss
        return -profit.mean()
``````

How could I make this better? My gut feeling is that multiplying by the tensor [0, -1, 1] probably causes gradient problems in the first position, because we are multiplying by zero…
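
One way to test that gut feeling is to inspect the gradients directly. The snippet below is a minimal sketch, not a full training setup: it assumes a single example with a difference of 500, equal starting logits, and fixed costs folded into the payoff vector. The variable names are illustrative.

```python
import torch

# One example with equal logits; true difference (blue - orange) is 500.
logits = torch.zeros(1, 3, requires_grad=True)
# Payoff per choice (nothing, negative, positive), fixed costs included.
payoff = torch.tensor([[0.0, -501.2, 499.8]])

probs = torch.softmax(logits, dim=1)
loss = -(probs * payoff).sum(dim=1).mean()
loss.backward()

# For this loss, d(loss)/d(logit_i) = -p_i * (payoff_i - expected_payoff).
# The "do nothing" payoff is 0, but its logit still gets a gradient
# whenever the expected payoff of the whole distribution is non-zero.
print(logits.grad)
```

Because softmax couples the three outputs, the zero in the payoff vector does not block the gradient for the first logit; it only means that choice contributes nothing to the expected profit itself.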

Any pointers to literature that could be interesting to learn more or any tips would be appreciated!

Thanks a lot!

Really great first formulation of the loss. I would like to know if you were further able to find any literature or do some research related to this.

One thing that might help is normalizing the targets (i.e. blue - orange) to the range [0, 1], or standardizing them, since values on the order of 10^2 might cause problems in backpropagation…
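
A hedged sketch of what such scaling could look like is below. It makes one assumption beyond the suggestion above: rather than squashing into [0, 1], which would move the zero point and hide the sign of the difference, it only divides by a scale. The batch values and variable names are hypothetical.

```python
import torch

# Hypothetical batch of targets (blue - orange), in dollars.
diffs = torch.tensor([500.0, -120.0, 35.0, -800.0])
costs = torch.tensor([0.0, 1.2, 0.2])  # nothing, negative, positive

# Divide by a scale only (no mean shift), so each target keeps its sign.
scale = diffs.abs().mean()  # in practice, computed once on the training set
scaled_diffs = diffs / scale
scaled_costs = costs / scale  # the fixed costs must shrink by the same factor

# Payoff matrix of shape [batch, 3], expressed in scaled units.
signs = torch.tensor([0.0, -1.0, 1.0])
payoff = scaled_diffs.unsqueeze(1) * signs - scaled_costs
print(payoff)
```

Scaling the costs together with the targets keeps the trade-off between betting and abstaining unchanged; only the magnitude of the loss and its gradients shrinks.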

Thanks,
Vinayak.

I was not really able to figure out this problem, but found some relevant literature to penalize the neural network more if it predicts something in the wrong sign direction:

For anyone interested, I found papers related to this. I didn’t know it, but the right term to search for on Google is “learning to abstain”. When the network is uncertain, it sends a signal (using various mechanisms, depending on the paper) that it is too uncertain to predict.
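
As a rough illustration of abstaining, and not the method of any particular paper, one simple mechanism is a confidence threshold applied at prediction time. The function name `decide` and the `min_confidence` parameter are hypothetical.

```python
import torch

CHOICES = ["do nothing", "bet negative", "bet positive"]

def decide(logits: torch.Tensor, min_confidence: float = 0.6) -> list[str]:
    """Pick one choice per row, falling back to 'do nothing' when unsure."""
    probs = torch.softmax(logits, dim=1)
    confidence, choice = probs.max(dim=1)
    # Abstain (index 0) whenever the top probability is below the threshold.
    choice = torch.where(confidence >= min_confidence, choice, torch.zeros_like(choice))
    return [CHOICES[i] for i in choice.tolist()]

logits = torch.tensor([[0.1, 0.2, 3.0],   # one clearly dominant logit
                       [0.5, 0.6, 0.4]])  # nearly uniform probabilities
print(decide(logits))
```

The threshold would have to be tuned on held-out data, for example by picking the value that maximizes realized profit after fixed costs.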

One way to mitigate this could be to use a differentiable approximation of the step function, like the sigmoid or tanh function, to map your predictions to probabilities. This could help smooth out the gradients and improve training stability. Also, consider experimenting with different cost structures and weighting schemes to see what works best for your specific problem. As for literature, “Deep Learning” by Goodfellow et al. is a great resource for diving deeper into neural network architectures and loss functions. Keep tinkering and iterating, and you’ll find the right balance! And hey, if you’re ever stuck, you can always roll a d20 and let randomness inspire your next move.