I am currently working on replicating some published image classification results. Since most public datasets provide a train set and a test set, I wonder what the accuracy figures in these papers actually represent. Do people:
Train their model on the training set and use the test set as the validation set?
OR
Split the training set into train and validation subsets, then use the test set only as a final measure after all training is done?
On platforms like Kaggle, two datasets are given:
Training dataset: as is typical practice, this dataset is split into training and validation sets during training.
Test dataset: this is public to all users, and the public leaderboard is ranked by accuracy on it. However, there is also a private test dataset, on which the final leaderboard standings are decided.
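To make the second option concrete, here is a minimal sketch of the split-then-hold-out workflow using scikit-learn. The data here is synthetic (a hypothetical stand-in for a real dataset's official train/test split), and the model choice is arbitrary; the point is only that the validation set is carved out of the official training set, while the test set is touched once at the end.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a dataset's official train/test split (hypothetical data).
rng = np.random.default_rng(0)
X_train_full = rng.normal(size=(1000, 20))
y_train_full = (X_train_full[:, 0] > 0).astype(int)
X_test = rng.normal(size=(200, 20))
y_test = (X_test[:, 0] > 0).astype(int)

# Carve a validation set out of the official training set only.
X_train, X_val, y_train, y_val = train_test_split(
    X_train_full, y_train_full, test_size=0.2, random_state=0
)

model = LogisticRegression().fit(X_train, y_train)

val_acc = model.score(X_val, y_val)      # used repeatedly for tuning / model selection
test_acc = model.score(X_test, y_test)   # computed once, reported as the final figure
print(f"validation accuracy: {val_acc:.3f}, test accuracy: {test_acc:.3f}")
```

Under this workflow, the validation accuracy drives hyperparameter choices, and the test accuracy is what would be quoted in a paper, mirroring Kaggle's public/private leaderboard separation.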