Confusion about why setting 'shuffle=True' wouldn't work for the validation set since our validation directory still has sub-directories for each class

ryan_c · February 27, 2018, 5:02pm

At ~24:56 in “Lesson 3: Deep Learning 2018”, Jeremy mentions that “by default the validation set doesn’t get shuffled because if you shuffle your validation set you can’t track how well you’re doing since it’s in a different order than the labels.” However, I’m confused about why this would be the case, since within our validation directory we still have the images separated into sub-directories corresponding to their correct class (same as in the training directory…). I’m assuming Keras keeps track of the sub-directory it pulls each image from, since it would need to do that in training regardless of the order in which it feeds the images to the model. If this is the case, then it would obviously do the same if we set ‘shuffle=True’ for the validation set as well, right?

Here’s the documentation related to the ‘shuffle’ argument in datagen.flow_from_directory. However, it doesn’t specify how the shuffling is actually performed.
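
To make the question concrete, here is a toy sketch in plain Python. It does not use Keras and makes no claim about how flow_from_directory shuffles internally; `samples`, `toy_predict`, and the fixed permutation standing in for a random shuffle are all made up for illustration. It only shows that the feeding order is harmless when each label travels with its image, and harmful when predictions are compared against a label list kept in the original directory order.

```python
# Toy stand-in for a validation directory: (filename, class_index) pairs
# listed in directory order, one sub-directory per class.
samples = [("cats/%d.jpg" % i, 0) for i in range(5)] + \
          [("dogs/%d.jpg" % i, 1) for i in range(5)]
directory_order_labels = [label for _, label in samples]


def toy_predict(filename):
    """Hypothetical 'perfect model': reads the class off the path."""
    return 0 if filename.startswith("cats/") else 1


def accuracy(preds, labels):
    """Fraction of positions where prediction and label agree."""
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)


# A fixed permutation standing in for a random shuffle (assumption:
# chosen only so the result is reproducible).
shuffled = samples[::2] + samples[1::2]

preds = [toy_predict(name) for name, _ in shuffled]
labels_seen = [label for _, label in shuffled]

# Case 1: each label is read together with its image, so order is irrelevant.
paired_accuracy = accuracy(preds, labels_seen)

# Case 2: predictions come out in the shuffled order but are compared
# against labels stored in the original directory order.
misaligned_accuracy = accuracy(preds, directory_order_labels)

print(paired_accuracy)      # 1.0
print(misaligned_accuracy)  # 0.6
```

If the quoted remark refers to the second case, the concern would apply only when predictions are scored against a separately stored label list, not when labels are yielded alongside each batch.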

Thanks, much appreciated!

wyquek · July 6, 2018, 2:04pm

I’m confused about that sentence too. I can’t find anyone on this forum or anywhere else who has given a plausible answer…