Ensembling with different validation sets?

Deichastronaut · July 12, 2017, 8:57pm

Hi,

I am wondering whether it makes sense to use a different kind of ensemble. So far we have used different runs of one model on a single training/validation split to build an ensemble.
What about using different training/validation splits to build the ensemble? When we use a single split, we lose the information in the validation set for training. So my thought is to mitigate this by using a different split for every model and averaging the predictions of the resulting models.

Any thoughts?

Christian

pietz · July 13, 2017, 12:24pm

What would you use to verify the performance of this ensemble? You couldn’t use any of the data that was used to train the models in your ensemble.
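One way to square the original idea with this concern is to set a test set aside before making any splits, train one model per training/validation split on the rest, and score the averaged predictions only on that untouched test set. The snippet below is a minimal sketch of that setup, not anyone's actual pipeline: the toy 1-D data, the "model" (a midpoint threshold between class means), and the helper names `make_data`, `fit_threshold`, `predict_proba`, and `accuracy` are all made up for illustration.

```python
import math
import random

def make_data(n, seed):
    """Toy 1-D binary data (hypothetical): class 0 around -1, class 1 around +1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2 * label - 1, 1.0), label))
    return data

def fit_threshold(train):
    """Stand-in for training a model: midpoint between the two class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2

def predict_proba(threshold, x):
    """Soft score in (0, 1) for class 1, based on distance from the threshold."""
    return 1 / (1 + math.exp(-2 * (x - threshold)))

def accuracy(thresholds, data):
    """Average the scores of all given models, then classify at 0.5."""
    correct = 0
    for x, y in data:
        avg = sum(predict_proba(t, x) for t in thresholds) / len(thresholds)
        correct += int((avg > 0.5) == (y == 1))
    return correct / len(data)

data = make_data(300, seed=0)
# The test set is split off first and is never used for training.
test, dev = data[:60], data[60:]

k = 3
# Fold i holds every k-th example, starting at offset i.
folds = [list(range(i, len(dev), k)) for i in range(k)]

thresholds, val_accs = [], []
for fold in folds:
    held_out = set(fold)
    train = [d for j, d in enumerate(dev) if j not in held_out]
    valid = [dev[j] for j in fold]
    t = fit_threshold(train)
    thresholds.append(t)
    val_accs.append(accuracy([t], valid))  # per-model validation score

# Only this number is measured on data that no ensemble member has seen.
ensemble_acc = accuracy(thresholds, test)
print("validation accuracy per fold:", val_accs)
print("ensemble accuracy on held-out test set:", ensemble_acc)
```

The per-fold validation scores remain useful for tuning each model, but every example in `dev` has been used to train at least one ensemble member, so only the `test` score says anything about the ensemble as a whole.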

simoneva · July 14, 2017, 10:47am

It is common to use cross-validation in sklearn, where you end up with a score for each fold and then average them. I am not sure why it is not used much in deep learning, but it is probably because of the time it takes to train: three folds take roughly three times as long to train as a single train/valid split.
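For reference, here is a small sketch of what that looks like with scikit-learn's `cross_val_score`. The synthetic dataset and the choice of `LogisticRegression` are assumptions made purely so the example runs quickly; they are not from the course.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data, used here only so the example is self-contained.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

model = LogisticRegression(max_iter=1000)

# cv=3 fits the model three times, each time holding out a different third.
scores = cross_val_score(model, X, y, cv=3)
mean_score = scores.mean()

print("score per fold:", scores)
print("mean score:", mean_score)
```

Each fold is a full training run, which is where the roughly threefold cost comes from.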

pietz · July 14, 2017, 11:07am

That's precisely the reason. It's also why it's often done when the datasets are small.