Masked BCE loss

In a transformer model that handles variable-length sequences, I pad my inputs with zeros. During loss calculation I need to exclude the padded portion of the tensor. How can I achieve this efficiently? I know the effective length of each sequence (say `seq_length`), which I return from my `Dataset`.

seq_length = torch.tensor([20, 50, 70])
input = select[-seq_length:]
label = select[-seq_length:]

If I use a for loop it may cause performance issues, since my batch size is large.
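One vectorized approach that avoids the per-sample loop is to build a boolean mask from the lengths with broadcasting and apply it to an unreduced BCE loss. This is a sketch, not the only way to do it; `masked_bce_loss` is a hypothetical helper name, and it assumes the sequences are left-padded so that the valid tokens occupy the last `seq_length` positions of each row (matching the `select[-seq_length:]` indexing above):

```python
import torch
import torch.nn.functional as F

def masked_bce_loss(logits, labels, seq_lengths):
    """BCE loss averaged over only the valid (non-padded) positions.

    Assumes left-padding: the last seq_lengths[i] positions of row i are
    valid. Shapes: logits, labels are (batch, max_len); seq_lengths is (batch,).
    """
    batch, max_len = logits.shape
    # Position index 0..max_len-1, broadcast against per-row lengths:
    # position p is valid when p >= max_len - seq_length.
    positions = torch.arange(max_len, device=logits.device).unsqueeze(0)  # (1, max_len)
    mask = positions >= (max_len - seq_lengths).unsqueeze(1)              # (batch, max_len)
    # reduction="none" keeps per-element losses so the mask can zero out padding.
    losses = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
    return (losses * mask).sum() / mask.sum()
```

The key trick is `reduction="none"`: the loss is computed element-wise over the whole padded batch in one call, then the mask zeroes the padded positions before averaging, so no Python loop over the batch is needed.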