Does/can LabelSmoothingCrossEntropy support an "ignore_index"?

wgpubs · March 18, 2020, 10:54pm

Working on integrating seq-to-seq model training into v2 and would love to be able to pass an ignore_index to LabelSmoothingCrossEntropy, so that it skips certain token ids (e.g., any -1 token ids) when calculating the loss between the generated text and the actual text.

In v1, not knowing any better, I just created my own class with this one modification in the forward pass:

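def forward(self, output, target):
    c = output.size()[-1]
    # Drop every position whose target equals ignore_index
    output = output[target != self.ignore_index]
    target = target[target != self.ignore_index]

    # The rest is the usual label-smoothing computation
    log_preds = F.log_softmax(output, dim=-1)
    ...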

Is there a better way in v2?

sgugger · March 18, 2020, 11:19pm

No, it does not support this. You can create your own custom loss function, as you mentioned.
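
For reference, here is a minimal sketch of what such a custom loss could look like in plain PyTorch. It is not fastai's implementation: the class name `LabelSmoothingCEIgnore`, the default `ignore_index=-1`, and the choice to always average over the kept tokens are assumptions made for illustration. It masks out ignored positions first, as in the snippet above, and then applies the standard label-smoothing formula.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LabelSmoothingCEIgnore(nn.Module):
    """Label-smoothing cross entropy that skips targets equal to ignore_index.

    Hypothetical class for illustration; not part of fastai or PyTorch.
    """

    def __init__(self, eps=0.1, ignore_index=-1):
        super().__init__()
        self.eps, self.ignore_index = eps, ignore_index

    def forward(self, output, target):
        c = output.size(-1)
        # Flatten so (batch, seq_len, vocab) and (batch, vocab) inputs both work
        output = output.reshape(-1, c)
        target = target.reshape(-1)

        # Keep only the positions whose target is not ignore_index
        keep = target != self.ignore_index
        output, target = output[keep], target[keep]
        if target.numel() == 0:
            # Nothing left to score: return a zero that stays in the graph
            return output.sum() * 0.0

        log_preds = F.log_softmax(output, dim=-1)
        # Uniform-target term, averaged over classes and over kept tokens
        smooth = -log_preds.sum(dim=-1).mean() / c
        # Ordinary negative log-likelihood, averaged over kept tokens
        nll = F.nll_loss(log_preds, target.long())
        return self.eps * smooth + (1 - self.eps) * nll


if __name__ == "__main__":
    logits = torch.randn(2, 5, 7)            # (batch, seq_len, vocab)
    target = torch.randint(0, 7, (2, 5))
    target[0, 1] = -1                        # mark one position as padding
    print(LabelSmoothingCEIgnore()(logits, target))
```

With `eps=0` this reduces to ordinary cross entropy with `ignore_index`. Recent PyTorch releases (1.10 and later) also accept both `ignore_index` and `label_smoothing` in `F.cross_entropy`, which may remove the need for a custom class altogether.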

wgpubs · March 18, 2020, 11:36pm

Ok thanks.

Is there a way to access the Learner from the loss function? Or would I need to pass in the Learner as an argument when creating it?

sgugger · March 18, 2020, 11:42pm

No, but a callback naturally has the Learner as an attribute and can modify the loss.

wgpubs · March 18, 2020, 11:46pm

Ok that makes sense.

And one last kinda related question … when I use splitter to create my own “layer groups”, how can I display those layer groups in v2? Tried len(learn.layer_groups) but that looks deprecated now.

sgugger · March 18, 2020, 11:54pm

They are in the optimizer: learn.opt (after you create it, if needed, with learn.create_opt).
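
To illustrate the idea that layer groups live in the optimizer as parameter groups, here is a small sketch using a plain `torch.optim` optimizer rather than fastai's own optimizer class. The `splitter` function and the toy model are hypothetical, and the attribute names on `learn.opt` in fastai v2 may differ from the `param_groups` list shown here.

```python
import torch
import torch.nn as nn

# Toy model standing in for a body (first layer) and a head (last layer)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


def splitter(m):
    # Hypothetical splitter: one parameter list per "layer group"
    return [list(m[0].parameters()), list(m[2].parameters())]


# Each list returned by the splitter becomes one optimizer parameter group
opt = torch.optim.SGD([{"params": g} for g in splitter(model)], lr=0.1)

n_groups = len(opt.param_groups)
sizes = [sum(p.numel() for p in g["params"]) for g in opt.param_groups]
print(n_groups, sizes)  # 2 [40, 18]
```

Inspecting the groups on the optimizer, rather than on the Learner, is what replaces the old `learn.layer_groups` check.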

wgpubs · March 18, 2020, 11:54pm

Aight … thanks much and much appreciated. Those are all my questions for the day (most likely).