I am working on a transformer model with variable-length sequences.
I pad the inputs with zeros, and during loss calculation I need to exclude the padded portion of each tensor. How can I do this efficiently? I know the effective length of each sequence (call it seq_length), which I return from the Dataset.
For example, with seq_length = torch.tensor([20, 50, 70]), for each sample I want to keep only the last seq_length elements: input = select[-seq_length:] and label = select[-seq_length:]
If I use a for loop over the batch, it may cause performance issues, since my batch size is large.
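
One loop-free approach that might fit here is to build a boolean mask from the lengths by broadcasting `torch.arange` against `seq_length`, then zero out the padded positions of an unreduced loss. The sketch below is only an illustration under several assumptions: the helper names `length_mask` and `masked_mse` are hypothetical, predictions and labels are assumed to have shape (batch, max_len), the loss is assumed to be MSE, and the valid elements are assumed to sit at the end of each row (to match the `[-seq_length:]` slicing above). Set `left_padded=False` if the padding is at the end instead.

```python
import torch
import torch.nn.functional as F


def length_mask(seq_length, max_len, left_padded=True):
    """Boolean mask of shape (batch, max_len); True marks non-padded positions."""
    positions = torch.arange(max_len, device=seq_length.device).unsqueeze(0)  # (1, max_len)
    lengths = seq_length.unsqueeze(1)  # (batch, 1)
    if left_padded:
        # Real values occupy the last `length` positions, like x[-length:].
        return positions >= (max_len - lengths)
    # Real values occupy the first `length` positions, like x[:length].
    return positions < lengths


def masked_mse(pred, target, seq_length, left_padded=True):
    """MSE averaged over non-padded positions only (assumed loss, for illustration)."""
    mask = length_mask(seq_length, pred.size(1), left_padded)
    per_elem = F.mse_loss(pred, target, reduction="none")  # (batch, max_len)
    # Zero out padded positions, then divide by the number of real positions.
    return (per_elem * mask).sum() / mask.sum().clamp(min=1)


# Usage with the lengths from the question (random data as a stand-in).
seq_length = torch.tensor([20, 50, 70])
pred = torch.randn(3, 70)
target = torch.randn(3, 70)
loss = masked_mse(pred, target, seq_length)
```

The mask is built in one vectorized comparison, so no Python loop over the batch is needed. For a classification loss, an alternative that may be simpler is to set the padded label positions to the `ignore_index` value accepted by `F.cross_entropy` (default -100).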