Lesson 3: why can't we pre-compute when using augmentation?

When using data augmentation, we can't pre-compute our convolutional layer features, since randomized changes are being made to every input image. That is, even if the training process sees the same image multiple times, each time it will have undergone different data augmentation, so the results of the convolutional layers will be different.
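A minimal sketch of the claim above, in plain Python. Everything here is a toy stand-in and not from the lesson: `frozen_conv` plays the role of the fixed convolutional layers, and `random_brightness` is a hypothetical augmentation that adds a random non-zero offset to every pixel. It only illustrates that features computed once from the original image no longer match what the layers would produce for each freshly augmented copy.

```python
import random

def frozen_conv(image):
    # Stand-in for frozen conv layers: a fixed 1x2 kernel [1, 2] slid along each row.
    return [[row[i] + 2 * row[i + 1] for i in range(len(row) - 1)] for row in image]

def random_brightness(image, rng):
    # Toy augmentation (assumption): add a random non-zero offset to every pixel.
    shift = rng.randint(1, 50)
    return [[pixel + shift for pixel in row] for row in image]

image = [[10, 20, 30], [40, 50, 60]]
precomputed = frozen_conv(image)  # computed once, before training

rng = random.Random()
# Each "epoch" augments the image again and recomputes the features.
per_epoch = [frozen_conv(random_brightness(image, rng)) for _ in range(3)]
stale = [features != precomputed for features in per_epoch]

print(precomputed)  # [[50, 80], [140, 170]]
print(stale)        # [True, True, True]
```

Because the offset is never zero, the cached features are stale in every epoch, which is the situation the lesson describes.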

I don’t understand (or don’t agree).
I can create features from the augmented data and take the labels from the augmented data (we know the labels).
I can then use those features and labels to train the dense layers. Of course, every augmented version of the same image will have different features, but I don’t see any problem with that.

Am I missing something?

I think what Jeremy refers to is “online augmentation”. Online augmentation applies the random changes to each input image every time, just before the data is fed into the network. In this case the network sees different images in every epoch.

What you describe is “offline augmentation”. This kind of augmentation is applied to the data once, before it is fed into the network, so the network sees the same images in every epoch.

Offline augmentation:

train_img, train_label = read_img("train_folder")
# augment once, before training starts
train_img, train_label = augment_img(train_img, train_label)
# every epoch, the network sees the same images
model.fit(train_img, train_label)

Online augmentation:

data_generator = ImageDataGenerator(....)
train_data = data_generator.flow_from_directory(...)
# the network sees different images every epoch
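The two snippets above can be mimicked with a small, self-contained sketch in plain Python. This is only an illustration under stated assumptions: `augment` is a hypothetical toy augmentation that adds random noise to each pixel, and the "images" are short lists of floats rather than real image data.

```python
import random

rng = random.Random()

def augment(image):
    # Toy augmentation (assumption): add independent random noise to each pixel.
    return [pixel + rng.uniform(-1.0, 1.0) for pixel in image]

images = [[0.0, 0.5, 1.0], [1.0, 0.5, 0.0]]
epochs = 3

# Offline: augment once up front, then reuse the stored result in every epoch.
offline_set = [augment(img) for img in images]
offline_epochs = [offline_set for _ in range(epochs)]

# Online: re-augment inside the loop, so every epoch draws fresh random changes.
online_epochs = [[augment(img) for img in images] for _ in range(epochs)]

# Offline epochs are identical by construction.
offline_identical = all(epoch == offline_epochs[0] for epoch in offline_epochs)
# Count how many distinct versions of the data the online loop produced.
online_distinct = len({tuple(map(tuple, epoch)) for epoch in online_epochs})

print(offline_identical)
print(online_distinct)
```

In the offline case the model is trained on one fixed augmented set, however many epochs are run; in the online case each epoch is built from new random draws.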

@tham has covered the difference between offline and online, but I thought I’d add the following which might help your understanding:

The point of data augmentation is to show the neural network slightly different images/data during each training epoch so that it’s robust to those variations. It helps prevent the model from overfitting. By precomputing the augmented data you’re showing your net the same data every epoch and in doing so you lose the main benefits of augmentation.

Thank you both for your answers. As I see it, the difference between online and offline augmentation comes down to how much memory I have.
Let’s say I have 10,000 images and I want to train for 10 epochs with augmentation. With online augmentation, this means the NN will see 100,000 different images.

Can’t I achieve the same result with offline augmentation, by creating 100,000 images from the 10,000 and training for 1 epoch? The only problem is that I need enough memory (or enough disk space to save them).

Am I correct?
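For a rough sense of scale, the arithmetic in this post can be sketched as follows. The image count and epoch count come from the post; the sizes are assumptions that the thread does not state (224x224 RGB inputs stored as 8-bit integers, and 512x7x7 float32 feature maps per image), so treat the byte totals as illustrative only.

```python
images = 10_000
epochs = 10
# One freshly augmented view per image per epoch.
augmented_views = images * epochs

# Assumed sizes (not from the thread):
raw_bytes = 224 * 224 * 3 * 1      # 224x224 RGB image, 1 byte per channel
feature_bytes = 512 * 7 * 7 * 4    # 512x7x7 feature map, 4 bytes per float32

raw_total = augmented_views * raw_bytes
feature_total = augmented_views * feature_bytes

print(augmented_views)             # 100000
print(raw_total / 1e9, "GB of raw augmented images")
print(feature_total / 1e9, "GB of pre-computed features")
```

Under these assumptions the stored data comes to roughly 15 GB of raw augmented images or roughly 10 GB of pre-computed features, which is why the question of memory or disk space comes up.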