For the Keras fit method, does "shuffle=True" shuffle both the training and validation samples, or just the training dataset?

Consider this piece of code:

lm.fit(train_data, train_labels, epochs=2, validation_data=(val_data, val_labels), shuffle=True)

When using fit_generator with batches, each set of batches can be created with shuffle=True or shuffle=False separately. But when using fit(), you don’t get the option to shuffle or not shuffle the validation set independently of the training set.

So my question is, when setting shuffle=True above, is only the training data getting shuffled OR is the validation data set getting shuffled as well?

Just the training data.


Cool. Thanks for the reply!

In the Keras documentation, I didn’t see that fit_generator has an option to shuffle. If it’s possible, can someone show an example of how? Thanks.

The shuffle argument is available when creating the batches using an image.ImageDataGenerator object.

See the documentation for flow() and flow_from_directory() here: https://keras.io/preprocessing/image/
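To illustrate the idea of setting shuffle separately per generator, here is a minimal pure-Python sketch. It is a stand-in, not the Keras implementation: batch_generator, its seed parameter, and the toy data are hypothetical names made up for this example. The training generator is created with shuffle=True and the validation generator with shuffle=False.

```python
import random

def batch_generator(samples, batch_size, shuffle, seed=None):
    """Hypothetical stand-in for a batch iterator (not a Keras API).

    Yields batches forever; if shuffle is True, the sample order is
    reshuffled at the start of every pass over the data.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    while True:
        if shuffle:
            rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            yield [samples[i] for i in indices[start:start + batch_size]]

# Toy data: ten samples identified by their index.
train_batches = batch_generator(list(range(10)), batch_size=5, shuffle=True, seed=0)
val_batches = batch_generator(list(range(10)), batch_size=5, shuffle=False)

# One full pass (two batches of five) over each set.
first_train_epoch = next(train_batches) + next(train_batches)
first_val_epoch = next(val_batches) + next(val_batches)

print("train order:", first_train_epoch)  # some permutation of 0..9
print("val order:  ", first_val_epoch)    # original order is preserved
```

With Keras itself, the same choice would be made through the shuffle argument of flow() or flow_from_directory() when each set of batches is created.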

Just so I can test my understanding, isn’t it irrelevant if the validation data is being shuffled, since a) it’s not adjusting any weights stochastically using the validation data, and b) the accuracy number should be the same regardless of the order the validation set is tested in? Is this correct?


Yes.

The validation set is just being used to measure how well the trained model works on examples it hasn’t seen during training, and so whether it is shuffled is irrelevant.
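A quick way to convince yourself of this is the small sketch below. It assumes a fixed model whose prediction for each sample does not depend on the other samples; the labels and predictions are made-up toy values, and accuracy is a helper written for this example rather than a Keras function.

```python
import random

def accuracy(predictions, labels):
    """Fraction of positions where the prediction matches the label."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Made-up validation labels and the predictions of some fixed model.
labels = [0, 1, 1, 0, 1, 0, 0, 1]
predictions = [0, 1, 0, 0, 1, 1, 0, 1]

# Shuffle the validation set, keeping each prediction paired with its label.
pairs = list(zip(predictions, labels))
random.Random(42).shuffle(pairs)
shuffled_predictions, shuffled_labels = zip(*pairs)

original_acc = accuracy(predictions, labels)
shuffled_acc = accuracy(shuffled_predictions, shuffled_labels)

print(original_acc, shuffled_acc)  # both 0.75: order does not change the count of correct answers
```

Accuracy is just a count of correct predictions divided by the number of samples, so any reordering of the samples gives the same number.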


The validation data is also used for tuning the parameters used for training (hyperparameters), though shuffling it is still irrelevant here.

Hi, thanks for the post.
Say I split my data into train, validation, and test sets. Train and validation are used during training, where validation is a fixed dataset (not cross-validated).
Test is used for evaluating model performance.

Do you mean that the shuffle should be done on both the training and validation sets?

Thanks,
eilalan

How can we use shuffle=True with fastai’s learn.fit and learner.fit_one_cycle?
All suggestions are welcome.
Thanking you,
Harshit