Mixup for multiple classification heads?

hobbes99 · May 18, 2022, 6:56pm

I’m adding multiple classification heads to a resnet34 body (implemented like this tutorial: Lesson 6 - Multimodal Learning with Bengali.AI | walkwithfastai), but MixUp() is configured for the single classifier problem.

Can anyone suggest how to add MixUp to the multi-modal classification problem? MixUp has proved essential for avoiding overfitting when I use only a single classification head …

hobbes99 · May 19, 2022, 6:47pm

OK, this turns out to be fairly easy. Since the loss function is a sum of cross-entropy loss functions, we don’t have to explicitly compute the linear combinations of labels. Just add the attribute y_int = True to the combined loss function.
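
For anyone wondering why the label mixing can be skipped: cross-entropy is linear in the target distribution, so the loss against a blended label equals the same blend of the losses against the two original labels, and that carries over to a sum over heads. Here is a small pure-Python check of that identity. The two heads, the probabilities, and the mixing weight are all made-up illustrative values, and the helper names are hypothetical, not fastai functions.

```python
import math

def cross_entropy(probs, target):
    """Cross-entropy of predicted probabilities against a (possibly soft) target."""
    return -sum(t * math.log(p) for p, t in zip(probs, target))

def one_hot(index, n_classes):
    return [1.0 if i == index else 0.0 for i in range(n_classes)]

def combined_loss(head_probs, head_targets):
    """Sum of per-head cross-entropies, as in a multi-head classifier."""
    return sum(cross_entropy(p, t) for p, t in zip(head_probs, head_targets))

# Made-up predictions for one sample from two hypothetical heads (3 and 2 classes).
head_probs = [[0.7, 0.2, 0.1], [0.4, 0.6]]
labels_a = (0, 1)  # integer labels of the original sample, one per head
labels_b = (2, 0)  # integer labels of the shuffled partner sample
lam = 0.3          # fixed mixing weight, for illustration only

targets_a = [one_hot(i, len(p)) for i, p in zip(labels_a, head_probs)]
targets_b = [one_hot(i, len(p)) for i, p in zip(labels_b, head_probs)]

# Option 1: blend the one-hot labels first, then compute the loss once.
blended = [[lam * a + (1 - lam) * b for a, b in zip(ta, tb)]
           for ta, tb in zip(targets_a, targets_b)]
loss_from_blended_labels = combined_loss(head_probs, blended)

# Option 2: keep the integer labels, compute two losses, then blend the losses.
loss_from_blended_losses = (lam * combined_loss(head_probs, targets_a)
                            + (1 - lam) * combined_loss(head_probs, targets_b))
```

Both options give the same number up to floating-point rounding, so blending the two per-sample losses is enough and the integer labels can stay as they are. As far as I can tell, that is what setting y_int = True asks MixUp to do.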

One other change is required in the MixUp class. The batch size is determined from the length of the label vector (y), but now we have a tuple of label vectors, each of length batch_size, so replace self.y.size(0) with self.y[0].size(0).
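
To make the two changes concrete, here is a rough standalone sketch in plain PyTorch. It is not the fastai callback itself: mixup_batch and multi_head_mixup_loss are hypothetical names, and alpha=0.4 and the tensor shapes are assumptions for illustration. The labels arrive as a tuple with one tensor per head, so the batch size is read from the first element.

```python
import torch
import torch.nn.functional as F

def mixup_batch(x, ys, alpha=0.4):
    """Mix inputs for a batch whose labels are a tuple of per-head tensors."""
    bs = ys[0].size(0)  # ys is a tuple, so take the batch size from the first head
    lam = torch.distributions.Beta(alpha, alpha).sample((bs,)).to(x.device)
    perm = torch.randperm(bs, device=x.device)
    # Reshape lam so it broadcasts over the non-batch dimensions of x.
    lam_x = lam.view(bs, *([1] * (x.dim() - 1)))
    x_mixed = lam_x * x + (1 - lam_x) * x[perm]
    ys_perm = tuple(y[perm] for y in ys)  # integer labels are shuffled, not blended
    return x_mixed, ys_perm, lam

def multi_head_mixup_loss(preds, ys, ys_perm, lam):
    """Blend per-sample summed cross-entropies instead of blending labels."""
    def per_sample(targets):
        return sum(F.cross_entropy(p, t, reduction="none")
                   for p, t in zip(preds, targets))
    return (lam * per_sample(ys) + (1 - lam) * per_sample(ys_perm)).mean()

# Toy usage with random data: two heads with 5 and 3 classes.
x = torch.randn(8, 3, 32, 32)
ys = (torch.randint(0, 5, (8,)), torch.randint(0, 3, (8,)))
x_mixed, ys_perm, lam = mixup_batch(x, ys)
preds = (torch.randn(8, 5), torch.randn(8, 3))  # stand-in for model outputs
loss = multi_head_mixup_loss(preds, ys, ys_perm, lam)
```

Calling .size(0) on ys directly would fail here because a tuple has no size method, which is the reason for indexing the first head.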