When and how do image transformations happen during NN training for image recognition?

When exactly do augmentation image transformations happen while training a NN for image recognition?
Does the library apply one of the defined transformations to each image every time it picks a batch for training? So basically each pass trains on a different variation of the input data set.
Or does it happen somehow differently?

Basically, I found the answer in lesson 7 of part 1, where Jeremy explains that this is exactly what happens with image transformations.
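
To make that concrete, here is a minimal sketch of the idea using plain PyTorch/torchvision (not the exact fastai internals): the random transforms run inside `__getitem__`, so every time the dataloader draws a batch, each image comes out as a fresh random variant, and no augmented copies are ever saved to disk. `ImageListDataset`, `paths`, and `labels` are hypothetical names for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image

# Random augmentations: a different variant is produced each time an image is loaded
train_tfms = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),
    transforms.ToTensor(),
])

class ImageListDataset(Dataset):
    """Hypothetical dataset: `paths` and `labels` are assumed lists of image files and targets."""
    def __init__(self, paths, labels, tfms):
        self.paths, self.labels, self.tfms = paths, labels, tfms

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, i):
        img = Image.open(self.paths[i]).convert("RGB")
        # The transform runs here, on the fly: the same index yields a
        # slightly different tensor every epoch.
        return self.tfms(img), self.labels[i]

# train_ds = ImageListDataset(paths, labels, train_tfms)
# train_dl = DataLoader(train_ds, batch_size=64, shuffle=True)
# for xb, yb in train_dl:  # each batch contains freshly augmented images
#     ...
```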