This depends on the training data and class distribution. For example, in the Kaggle Carvana competition there were images of cars from multiple angles, so in that case you wouldn't want the same car showing up in both the training and validation sets. This can also be an issue with medical data: if multiple images in the dataset belong to the same patient, then the training and validation sets should (ideally) not contain images from the same patient.
The general rule of thumb is that you want your validation set to be as good a representation as possible of a test set (unlabeled data that the model hasn't seen). So if your validation set contains images that your model has already seen, then you aren't really getting a good idea of how well your model will generalize to unseen/unlabeled data.
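The patient example above can be sketched with scikit-learn's `GroupShuffleSplit`, which splits by group so that all images from one patient land entirely in either the training or the validation set (the image names and patient IDs here are made up for illustration):

```python
# Group-aware split: no patient's images leak across the train/valid boundary.
from sklearn.model_selection import GroupShuffleSplit

# Toy dataset: 8 images from 4 patients, two images each (hypothetical names).
images = [f"img_{i}.png" for i in range(8)]
patient_ids = [0, 0, 1, 1, 2, 2, 3, 3]

# test_size=0.25 puts roughly 25% of the *groups* (patients) into validation.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=42)
train_idx, valid_idx = next(splitter.split(images, groups=patient_ids))

train_patients = {patient_ids[i] for i in train_idx}
valid_patients = {patient_ids[i] for i in valid_idx}

# The two sets of patients never overlap, so the validation images
# really are "unseen" at the patient level.
assert train_patients.isdisjoint(valid_patients)
```

A plain random split over images would almost certainly put some patients in both sets, which is exactly the leakage described above.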
To be clear, though, ultimately you should run tests and empirically check the results. Whatever works best in practice isn't always what you might have expected! Run lots of experiments and have fun with it.