I have a binary classification problem where the penalty incurred for a wrong prediction varies (roughly linearly) with the value of one of the continuous features. The higher this feature, the higher the penalty for a wrong prediction.

Is there a common way to bake this into a model? I’m thinking this requires a custom loss function, but I’m not sure where to begin my research. Is there a buzzword I should search for, an easier way to include this idea, or academic papers that explore this?

Hi, I’m not an expert on this, so take my answer with care.
If I wanted to add a penalty to my loss function that grows linearly with a feature, I would simply try adding that feature's value multiplied by a penalty weight. I would fold the scale into the weight so that the extra term roughly matches the magnitude of the original loss, which depends on the particular loss and the particular task. So:
Loss = Loss_original + penalty_weight * your_continuous_variable_value
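
Here is a rough sketch of that formula in Python with NumPy, in case it helps. The function names, the data, and the choice of binary cross-entropy as Loss_original are all made up for illustration. One caveat I'm fairly sure about: the added term does not depend on the model's predictions, so it shifts the loss value but not its gradient, and the fitted model would not change. Multiplying each example's loss by a weight that grows with the feature does change what the model learns. That idea usually goes by "cost-sensitive learning" or "sample weighting", which may be the buzzwords to search for. The sketch shows both variants; the weight form 1 + penalty_weight * feature is only an assumption and needs a non-negative feature.

```python
import numpy as np

def bce_per_example(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy for each example, without averaging."""
    p = np.clip(p_pred, eps, 1.0 - eps)
    return -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))

def additive_loss(y_true, p_pred, feature, penalty_weight):
    """Loss_original + penalty_weight * feature, averaged over examples."""
    return np.mean(bce_per_example(y_true, p_pred) + penalty_weight * feature)

def weighted_loss(y_true, p_pred, feature, penalty_weight):
    """Per-example loss scaled by a weight that is linear in the feature.

    The form 1 + penalty_weight * feature is a hypothetical choice and
    assumes feature >= 0 so that every weight stays positive.
    """
    weights = 1.0 + penalty_weight * feature
    return np.mean(weights * bce_per_example(y_true, p_pred))

# Made-up data: four examples with increasing feature values.
y = np.array([1.0, 0.0, 1.0, 0.0])
feature = np.array([0.5, 1.0, 2.0, 10.0])

p_good = np.array([0.9, 0.1, 0.9, 0.1])      # right on every example
p_bad_low = np.array([0.1, 0.1, 0.9, 0.1])   # wrong where feature = 0.5
p_bad_high = np.array([0.9, 0.1, 0.9, 0.9])  # wrong where feature = 10

for name, p in [("good", p_good), ("bad_low", p_bad_low), ("bad_high", p_bad_high)]:
    print(name,
          additive_loss(y, p, feature, 0.1),
          weighted_loss(y, p, feature, 0.1))
```

With these numbers the additive version scores the two mistakes identically, while the weighted version scores the mistake on the high-feature example as worse. Many libraries accept per-example weights directly (often through a `sample_weight` or `weight` argument), so a fully custom loss may not be needed.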