Batch prediction ordering bias

I’m working on a multi-class classification problem using a resnet34 with focal loss / BCE loss.

When calling learn.get_preds on a batch, the predictions for a particular item seem to depend on the order of the items in the batch. With a batch of size 512 arranged as [256 class a, 256 class b] I get one set of predictions; with the order swapped to [256 class b, 256 class a], the predicted scores for the same item can differ by 0.0562.

For example, the topk=5 predictions under the two orderings (the categories are identical):
tensor([0.7136, 0.3771, 0.2724, 0.2008, 0.1377])
vs
tensor([0.6574, 0.3511, 0.2789, 0.2562, 0.1542])

Is this expected behaviour? What could cause it? It also seems that I get better validation results with a batch size of 1, and I wonder whether this is hampering my training too.
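
To make the comparison concrete, here is a minimal sketch of the ordering check in plain PyTorch rather than through learn.get_preds. The small model, the random inputs and the predict helper are hypothetical stand-ins, not my actual resnet34 or data; the point is only to show how the two orderings are re-aligned and compared item by item.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real network: a tiny model with BatchNorm.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.BatchNorm1d(32),
    nn.ReLU(),
    nn.Linear(32, 5),
)

def predict(model, x):
    """Run one forward pass in eval mode and return sigmoid scores."""
    model.eval()  # BatchNorm uses running statistics, dropout is disabled
    with torch.no_grad():
        return torch.sigmoid(model(x))

# Two groups of 256 items, standing in for "class a" and "class b".
a = torch.randn(256, 16)
b = torch.randn(256, 16) + 2.0

preds_ab = predict(model, torch.cat([a, b]))  # ordering [a, b]
preds_ba = predict(model, torch.cat([b, a]))  # ordering [b, a]

# Re-align the second result so that row i refers to the same item in both.
preds_ba_aligned = torch.cat([preds_ba[256:], preds_ba[:256]])

max_diff = (preds_ab - preds_ba_aligned).abs().max().item()
print(f"max abs difference across orderings: {max_diff:.2e}")
```

In this sketch every row is processed independently once the model is in eval mode, so any difference between the orderings should be at the level of floating-point noise. A gap as large as 0.0562 would suggest that something batch-dependent is still active in my setup, for example layers left in training mode or the 512 items being split into different mini-batches, though I have not confirmed either.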