Data augmentation question

Hi all,
When I do the transform with data augmentation:

tfms = tfms_from_model(resnet34, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
data = ImageClassifierData.from_paths(PATH, tfms=tfms)

shouldn’t the value of data.trn_y.size increase? I thought my training set would be bigger, since it would contain the original images plus the augmented versions of those images.


After re-watching lesson 2 I think I may have my own answer:

When we apply transforms with augmentation, for each image we pick one of several augmented variants (rotated, zoomed, flipped, etc.).
On each epoch we pick a different variant of each image, so over the course of training we effectively see more distinct images, but the size of the training set does not vary from epoch to epoch.

Am I right?


Seems like that to me.
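Here's a minimal pure-Python sketch of that idea (the class and names here are illustrative, not the actual fastai code): augmentation happens when an item is fetched, so `len()` never changes, yet each epoch can see different pixels.

```python
import random

# Toy stand-in for an image dataset with on-the-fly augmentation.
class AugmentedDataset:
    def __init__(self, images):
        self.images = images  # only the original training images are stored

    def __len__(self):
        # Size never changes: augmentation adds no extra items.
        return len(self.images)

    def __getitem__(self, i):
        # A random transform is applied each time the item is fetched,
        # so every epoch may see a different version of image i.
        img = self.images[i]
        return img[::-1] if random.random() < 0.5 else img

ds = AugmentedDataset([[1, 2, 3], [4, 5, 6]])
epoch1 = [ds[i] for i in range(len(ds))]
epoch2 = [ds[i] for i in range(len(ds))]
print(len(ds))  # 2 -- same size every epoch
```

The variants differ between epochs only by chance here, but the point is the same as in fastai: the dataset length reported to the training loop stays fixed.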

To verify, I intentionally put a bug into the Transform.transform(...) method:

    def transform(self, x, y=None):
        x = self.do_transform(x, False)
        foo  # deliberate NameError, just to surface a stack trace
        return (x, self.do_transform(y, True)) if y is not None else x

The stack trace goes up through BaseDataset.get1item(...) and DataLoader.get_batch(...), which shows that transforms are applied as each batch is fetched, not ahead of time.

Try adding print(...) statements in the code yourself and see if you agree.
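If you'd rather not crash the training loop, a counter works just as well as a print. A toy sketch (names are made up, not fastai's): if transforms pre-enlarged the dataset, the transform would run once per image total; instead it runs once per image per epoch.

```python
import random

calls = 0

def transform(x):
    # Stand-in for Transform.transform: count invocations instead of printing.
    global calls
    calls += 1
    return x[::-1] if random.random() < 0.5 else x

images = [[1, 2], [3, 4], [5, 6]]
for epoch in range(2):
    batch = [transform(img) for img in images]
    print(f"epoch {epoch}: {calls} transform calls so far")

# After 2 epochs over 3 images the transform has run 6 times:
# each epoch re-transforms the same 3 originals rather than
# reading from a pre-enlarged dataset.
```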