Data augmentation question

Bliss · November 19, 2018, 5:34pm

Hi all,
When I do the transform with data augmentation:

tfms = tfms_from_model(resnet34, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
data = ImageClassifierData.from_paths(PATH, tfms=tfms)

Shouldn’t I see the value of data.trn_y.size increase? I thought my training set would get bigger, since it would contain the original images plus the augmented versions of those images.

Thanks!
Bliss

Bliss · November 20, 2018, 10:16am

After re-watching lesson 2 I think I may have my own answer:

When we transform with augmentation, each time an image is loaded we pick one of the possible augmented versions of it (rotated, zoomed, flipped, etc.).
On each epoch we pick a different version of each image, so we are increasing the number of distinct images the model trains on, but the size of the training set stays the same from epoch to epoch.
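
To make that concrete for myself, here is a minimal sketch in plain Python. It is not fastai code: the AugmentedDataset class, the tiny list-of-lists "images" and the 50% horizontal flip are all made up for illustration. It only shows that the length of the dataset stays fixed while the images seen across epochs vary.

```python
import random

class AugmentedDataset:
    """Toy dataset (hypothetical): applies a random flip each time an item is fetched."""
    def __init__(self, images, labels, seed=0):
        self.images, self.labels = images, labels
        self.rng = random.Random(seed)

    def __len__(self):
        # Augmentation never changes the number of stored items.
        return len(self.images)

    def __getitem__(self, i):
        img = self.images[i]
        if self.rng.random() < 0.5:
            img = [row[::-1] for row in img]  # horizontal flip, chosen at load time
        return img, self.labels[i]

# Two tiny 2x3 "images" with non-symmetric rows, so a flip is visible.
images = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [0, 1, 2]]]
labels = [0, 1]
ds = AugmentedDataset(images, labels)

seen = set()
for epoch in range(20):
    for i in range(len(ds)):
        img, _ = ds[i]
        seen.add((i, tuple(map(tuple, img))))

print(len(ds))    # size of the training set, the same on every epoch
print(len(seen))  # distinct (index, image) variants seen across all epochs
```

The second number ends up larger than the first, which is the point: more distinct images over time, same training set size.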

Am I right?

Regards
Bliss

acs · December 16, 2018, 8:07pm

That seems right to me.

To verify, I intentionally put a bug into the Transform.transform(...) method:

    def transform(self, x, y=None):
        x = self.do_transform(x,False)
        foo  # silly bug: undefined name, raises NameError so we get a stack trace
        return (x, self.do_transform(y,True)) if y is not None else x

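The stack trace goes up through BaseDataset.get1item(...) and DataLoader.get_batch(...), which suggests the transform runs each time an item is fetched for a batch rather than once up front.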

Try adding print(...) statements in the code yourself and see if you agree.
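
If you would rather not edit the library, the same kind of check can be sketched on a toy pipeline. Everything below is hypothetical (ToyDataset, get_item, get_batch and the pass-through transform are stand-ins, not fastai's actual classes); it just records who called the transform and how often.

```python
import traceback

calls = []  # one entry per transform call: names of the calling functions

def transform(x):
    # Record the call stack (minus this function) instead of crashing on purpose.
    calls.append([frame.name for frame in traceback.extract_stack()[:-1]])
    return x  # a real transform would randomly augment x here

class ToyDataset:
    """Hypothetical stand-in for a dataset that transforms items when they are fetched."""
    def __init__(self, xs):
        self.xs = xs

    def __len__(self):
        return len(self.xs)

    def get_item(self, i):
        return transform(self.xs[i])

def get_batch(ds, idxs):
    """Hypothetical stand-in for a data loader assembling one batch."""
    batch = []
    for i in idxs:
        batch.append(ds.get_item(i))
    return batch

ds = ToyDataset([10, 20, 30])
for epoch in range(2):
    get_batch(ds, range(len(ds)))

print(len(calls))     # number of transform calls: one per item per epoch
print(calls[0][-2:])  # the two innermost callers of transform
```

In this toy version the transform is called once per item per epoch and always from the batch-loading path, which is the same pattern the stack trace above points to.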

Regards,

ACS