I am wondering whether it makes sense to use a different kind of ensemble. So far we have used different runs of one model on a single training/validation split to build an ensemble.
What about using different training/validation splits to build the ensemble? When we use a single split, we lose the information in the validation set for training. So my thought is to mitigate this by using a different split for each model and averaging the predictions of the resulting models.
What would you use to verify the performance of this ensemble? You couldn’t use any of the data that was used to train the models in your ensemble, so you would need a separate held-out set that none of the models has seen.
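To make that concrete, here is a rough sketch of one way it could work: set aside a test set first, train one model per fold on the rest, and average the predicted probabilities on the test set. Everything specific here is an assumption for illustration, not something from this thread: the synthetic data, `LogisticRegression` as a cheap stand-in for a deep model, and the choice of 3 folds.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, train_test_split

# Synthetic data, used only for illustration (assumption).
X, y = make_classification(n_samples=600, n_features=20, random_state=0)

# Hold out a test set that no ensemble member ever trains or validates on.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

fold_probs = []
fold_valid_acc = []
kf = KFold(n_splits=3, shuffle=True, random_state=0)
for train_idx, valid_idx in kf.split(X_dev):
    # LogisticRegression stands in for whatever model you actually train.
    model = LogisticRegression(max_iter=1000)
    model.fit(X_dev[train_idx], y_dev[train_idx])
    # Each model still has its own validation fold for monitoring.
    fold_valid_acc.append(model.score(X_dev[valid_idx], y_dev[valid_idx]))
    fold_probs.append(model.predict_proba(X_test))

# Average the per-model class probabilities, then take the most likely class.
ensemble_probs = np.mean(fold_probs, axis=0)
ensemble_pred = ensemble_probs.argmax(axis=1)
ensemble_acc = (ensemble_pred == y_test).mean()
print(f"per-fold validation accuracy: {fold_valid_acc}")
print(f"ensemble test accuracy: {ensemble_acc:.3f}")
```

Across the folds every development sample is used for training by some model, but the test set stays untouched, so the ensemble score is not contaminated.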
It is common to use cross-validation in sklearn, where you end up with a score for each fold and then average them. I am not sure why it is not used much in deep learning, but it is probably because of the time taken to train: 3 folds take roughly 3 times as long to train as a single train/valid split.
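For reference, a minimal sketch of the sklearn pattern I mean, using `cross_val_score` with `cv=3`. The dataset and the `LogisticRegression` estimator are placeholders I picked for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder data and estimator (assumptions for illustration).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
estimator = LogisticRegression(max_iter=1000)

# cv=3 fits the estimator three times, once per fold, and returns
# one score per fold.
scores = cross_val_score(estimator, X, y, cv=3)
mean_score = scores.mean()
print(f"fold scores: {scores}, mean: {mean_score:.3f}")
```

The three fits are exactly where the extra cost comes from: each fold is a full training run.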
That's precisely the reason. It's also why cross-validation is mostly done when the datasets are small.