Validation and test sets in popular classification datasets

I am currently working on replicating some published image classification results. Since most public datasets provide only a training set and a test set, I wonder what the accuracy figures in these papers actually represent. Do people:

Train their model on the training set and use the test set as validation?

Split the training set into training and validation sets, then use the test set as a final measure after all the training is done?


On platforms like Kaggle, two datasets are typically provided:

  1. Training dataset: In typical practice, this dataset is split into a training set and a validation set during training.

  2. Test dataset: This is public to all users, and the public leaderboard results are based on accuracy on this dataset. However, there is also a private test dataset, which decides the final leaderboard standings.
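The split described in point 1 can be sketched roughly as below. This is only a minimal illustration using the standard library: the function name `split_train_val`, the 20% validation fraction, the seed, and the sample count of 1000 are all my own assumptions, not something taken from a particular paper or competition. The test set is deliberately left out of the split, so it can be used once at the end as the final measure.

```python
import random


def split_train_val(n_samples, val_fraction=0.2, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into train and validation parts.

    Hypothetical helper for illustration; the test set is never touched here.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    indices = list(range(n_samples))
    # A fixed seed keeps the split reproducible across runs.
    random.Random(seed).shuffle(indices)
    n_val = int(round(n_samples * val_fraction))
    # The first n_val shuffled indices become validation, the rest training.
    return indices[n_val:], indices[:n_val]


# Assumed sizes, for illustration only.
train_idx, val_idx = split_train_val(n_samples=1000, val_fraction=0.2, seed=42)
print(len(train_idx), len(val_idx))  # prints "800 200"
```

The validation indices would then be used for model selection and hyperparameter tuning, with the test set evaluated only after those choices are fixed.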