Can label smoothing be used for multi-label images?

Thanks so much for your response!

It seems based on this post it was possible to try label smoothing with multi-label:

However, he was saying it did not add to 1, which seems to be important to match it up with the probabilities.

Would it make sense to set the labels to \frac{1-\epsilon}{n} for those labeled 1 and \frac{\epsilon}{N-n} for those labeled 0 where n is the number of positive labels per data point?

In terms of loss for each data point with n labels that are one-hot encoded, it would be:

\sum_i\frac{(1-\frac{N-n}{N}\epsilon)}{n}(-\log(p_i)) + \sum_{j \neq i} \frac{\epsilon}{N}(-\log(p_j))

where i are the positive labels.

Does this seem correct?

1 Like