Get_preds returning less results than length of original dataset

sgugger · January 5, 2019, 8:21am

The training dataloader drops the last batch if it doesn’t have bs elements. That’s because small batches lead to instability in training, particularly with BatchNorm. With a batch of size 1 we would even get an error from pytorch.

If you want all your training data, you can ask DataSet.Fix, which is the training set with shuffle=False and drop_last=False, ideal for evaluation.