Ignore index in fastai NLP loss function

I have noticed that the default CrossEntropyFlat loss function used when building a TextLMDataBunch does not ignore the padding index or the xxfake index added by the numericalizer/tokenizer. Does anyone know why that is, and whether it is important to modify the loss function so that it ignores them?

I am also not sure how to modify the loss function to ignore multiple indices. I suspect one would want to exclude the PAD and XXFAKE positions from the loss so that they contribute no gradient, but the cross entropy loss in PyTorch takes a single int for ignore_index, not a list.
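
To make the question concrete, here is a rough sketch of what I have in mind, in plain PyTorch rather than anything fastai-specific. The function name cross_entropy_ignore_many is my own, and the index values in the usage lines are placeholders, not the real PAD/XXFAKE ids (those would need to be looked up in the vocab). It computes the per-token loss with reduction="none", zeroes out the positions whose target is in the ignore list, and averages over the rest:

```python
import torch
import torch.nn.functional as F


def cross_entropy_ignore_many(logits, targets, ignore_indices):
    """Mean cross entropy over positions whose target is not in ignore_indices.

    Hypothetical helper, not part of fastai or PyTorch.
    logits:  (..., n_classes) float tensor
    targets: (...) long tensor of class ids
    """
    # Flatten to (N, n_classes) and (N,), as a "flat" loss would.
    logits = logits.reshape(-1, logits.size(-1))
    targets = targets.reshape(-1)

    # Per-token loss, no averaging yet.
    losses = F.cross_entropy(logits, targets, reduction="none")

    # keep[i] is False when targets[i] equals any of the ignored ids.
    ignore = torch.tensor(list(ignore_indices), device=targets.device)
    keep = ~(targets.unsqueeze(-1) == ignore).any(dim=-1)
    keep = keep.to(losses.dtype)

    # Masked positions are multiplied by zero, so they add nothing to the
    # loss or the gradient. clamp avoids 0/0 if every position is ignored.
    return (losses * keep).sum() / keep.sum().clamp(min=1)


# Placeholder ids 0 and 1 stand in for the tokens to ignore.
example_logits = torch.randn(2, 3, 5)
example_targets = torch.tensor([[1, 0, 2], [4, 1, 0]])
example_loss = cross_entropy_ignore_many(example_logits, example_targets, [0, 1])
```

With a single ignored id this should give the same value as passing ignore_index to F.cross_entropy, since the built-in mean also averages only over the non-ignored targets. Whether this is actually the right thing to do for a language model is still my open question.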
