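I hope this isn’t too pedantic, but it may be worth creating explicit train/valid folders rather than using a random split if some of these images come from the same photoshoot (i.e. the same person in the same location). With a random split, near-identical images from one shoot can land on both sides, so the validation score ends up looking better than it really is. You want the model to learn the hand keypoints, not to recognize the person or setting in the image.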
I’m assuming you’re using this. If not, maybe your dataset doesn’t have this problem.
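If it does have this problem, here is a rough sketch of a group-wise split in plain Python. It assumes each filename starts with a photoshoot ID followed by an underscore (e.g. `shootA_001.jpg`). That naming scheme is made up for illustration, as are `split_by_group` and `group_of`, so adapt them to however your dataset actually identifies a shoot.

```python
import random
from collections import defaultdict


def split_by_group(items, group_of, valid_frac=0.2, seed=42):
    """Split items so each group ends up entirely in train or entirely in valid."""
    # Bucket the items by their group key (here: the photoshoot).
    groups = defaultdict(list)
    for item in items:
        groups[group_of(item)].append(item)

    # Shuffle the group keys, not the individual images.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)

    # valid_frac is a fraction of groups, so the image ratio is only approximate.
    n_valid = max(1, round(len(keys) * valid_frac))
    valid_keys = set(keys[:n_valid])

    train = [x for k in keys if k not in valid_keys for x in groups[k]]
    valid = [x for k in keys if k in valid_keys for x in groups[k]]
    return train, valid


# Hypothetical filenames: "<photoshoot id>_<frame number>.jpg"
files = [
    "shootA_001.jpg", "shootA_002.jpg",
    "shootB_001.jpg",
    "shootC_001.jpg", "shootC_002.jpg",
    "shootD_001.jpg",
    "shootE_001.jpg",
]
train, valid = split_by_group(files, group_of=lambda name: name.split("_")[0])
```

You could then copy `train` and `valid` into separate folders, or pass the two lists to whatever splitter your training code accepts. Note that the fraction applies to photoshoots rather than images, so uneven shoot sizes will skew the actual train/valid ratio.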