I trained convnets with different architectures and validated each using 5-fold cross-validation. I then retrained the final version of each convnet on the full data set (training + validation splits) and evaluated it on an external test set (i.e., data never used before).
The thing is, the test-set performance was about 10% higher than the cross-validation performance. This happened for all architectures, and even for a random-forest model I used as a baseline. I believe this is because the final models were trained on much more data: in 5-fold CV each model only sees 4/5 of the data, while the final model sees all of it.
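To make the setup concrete, here is a minimal sketch of the workflow I'm describing, using scikit-learn with a toy dataset and a random forest as stand-ins for my actual data and models:

```python
# Hypothetical sketch of the workflow described above (toy data and a
# random-forest placeholder, not the original experiment).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# External test set, never touched during model selection.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)

# 5-fold CV on the development data: each fold trains on only 4/5 of it.
cv_scores = cross_val_score(model, X_dev, y_dev, cv=5)

# Final model retrained on the full development data, scored on the test set.
model.fit(X_dev, y_dev)
test_score = model.score(X_test, y_test)

print(f"5-fold CV accuracy: {cv_scores.mean():.3f}")
print(f"Test accuracy:      {test_score:.3f}")
```

In my case the second number came out roughly 10% higher than the first, consistently across models.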
Do you think this explanation is right? And how can I make sure my validation setup is sound?