Yes, we changed the default drop_last to True for the training set because small batches (especially batches of size 1) cause BatchNorm layers to fail (that is the recommendation from PyTorch). And yes, I'm guessing that is what breaks ordered=True.
A workaround is to use fix_dl to get the predictions on the training set (it is the same as train_dl minus the transforms (in vision), and with shuffle=False, drop_last=False).
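
To see why the two loaders can return a different number of predictions, here is a rough plain-Python sketch of the batching arithmetic. It is not fastai or PyTorch code: the `batches` helper and the sizes (10 items, batch size 4) are made up for illustration, and it only mimics what an unshuffled loader does with drop_last.

```python
def batches(n_items, batch_size, drop_last):
    """Yield lists of item indices, mimicking an unshuffled batch sampler (hypothetical helper)."""
    batch = []
    for idx in range(n_items):
        batch.append(idx)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # The incomplete final batch is only kept when drop_last is False.
    if batch and not drop_last:
        yield batch

n_items, batch_size = 10, 4  # illustrative sizes, not from the original post

# Like train_dl with drop_last=True: the trailing partial batch is skipped.
train_like = [i for b in batches(n_items, batch_size, drop_last=True) for i in b]

# Like fix_dl with drop_last=False: every item is visited once, in order.
fix_like = [i for b in batches(n_items, batch_size, drop_last=False) for i in b]

print(len(train_like))  # 8
print(len(fix_like))    # 10
```

So with drop_last=True the last two items never get a prediction, which is why the output no longer lines up with the dataset, while the fix_dl-style settings cover every item in order.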