Advice for Invasive Species Kaggle Competition

If you precomputed and saved the features produced by one epoch of the ImageDataGenerator and then used that fixed data to train the top layers, you are not really using data augmentation. You created just one (or 4-6) random augmented versions of each image and you are reusing them for every following epoch. That is probably why the top layers overfit so easily and end up with a high validation loss. There was a forum thread about this a while back: Lesson 3, why can't we pre-compute when using augmentation?

Try training the entire network from the data generator, and therefore without the precomputed features. The augmentation is then sampled at random again for each epoch, so the network sees different variations every time. You can train only the top layers first and then all the layers to see the difference.
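To make the difference concrete, here is a minimal toy sketch in plain Python (no Keras). The `augment` and `variants_seen` functions and the tiny 2x4 "image" are hypothetical stand-ins for illustration, not the ImageDataGenerator API. It only counts how many distinct augmented versions of one image the model would see over 20 epochs in each setup.

```python
import random

def augment(image, rng):
    """Toy augmentation: random horizontal flip plus a random wrap-around shift."""
    flip = rng.random() < 0.5
    rows = [row[::-1] if flip else row[:] for row in image]
    shift = rng.randrange(len(rows[0]))
    # Return nested tuples so the result is hashable and can be stored in a set.
    return tuple(tuple(row[shift:] + row[:shift]) for row in rows)

def variants_seen(image, epochs, precompute, seed=0):
    """Count the distinct augmented versions of `image` seen across all epochs."""
    rng = random.Random(seed)
    # Precomputing means augmenting once, up front, and caching the result.
    cached = augment(image, rng) if precompute else None
    seen = set()
    for _ in range(epochs):
        sample = cached if precompute else augment(image, rng)
        seen.add(sample)
    return len(seen)

image = [[1, 2, 3, 4], [5, 6, 7, 8]]  # hypothetical 2x4 "image"
fixed = variants_seen(image, epochs=20, precompute=True)
fresh = variants_seen(image, epochs=20, precompute=False)
print("precomputed:", fixed, "distinct version(s)")
print("on the fly: ", fresh, "distinct version(s)")
```

With precomputing, the count stays at 1 no matter how many epochs you run. With on-the-fly generation it grows toward the number of possible variations (8 in this toy case, far more with real image augmentation).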

I still don’t quite understand why you couldn’t overfit when training all the layers. Did you use the image data generator for that full training as well?