I’m updating some old code, and I’m taking the chance to look more closely at all the details of the pipeline.
I’ve noticed that the number of train batches seems to be the number of full batches, with no repetition of items, so a trailing partial batch is dropped: with 5 images and batch size 2, we get 2 batches and 1 image seems to be left out. For validation, it seems to be the minimum number of batches that includes every example at least once: with 3 images and batch size 2, we get 2 batches.
In a basic test with 5 train images and 3 validation images, I have:
Batch size 1: 5 train batches, 3 val batches
Batch size 2: 2 train batches, 2 val batches
Batch size 3: 1 train batch, 1 val batch
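To check that these counts fit the pattern I described, here is a small sketch in plain Python. It assumes (my guess, not something I have confirmed in the library) that train drops the last incomplete batch while validation keeps it; `expected_batches` and its `drop_last` flag are names I made up for this check.

```python
import math


def expected_batches(n_items: int, batch_size: int, drop_last: bool) -> int:
    """Hypothetical helper: batch count when the last partial batch is dropped or kept."""
    if drop_last:
        # Only full batches count; leftover items are skipped.
        return n_items // batch_size
    # Every item must appear, so a final smaller batch is allowed.
    return math.ceil(n_items / batch_size)


for bs in (1, 2, 3):
    train = expected_batches(5, bs, drop_last=True)   # assumed train behaviour
    val = expected_batches(3, bs, drop_last=False)    # assumed validation behaviour
    print(f"Batch size {bs}: {train} train, {val} val")
```

Under that assumption, the loop reproduces the three rows above.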
I’ve also tried to look into the batches one by one using `one_batch`, but train seems to deliver them randomly, while validation always returns the first one.
Is all of this correct? Did I misinterpret anything?