How to put data inside the test set

naveenmanwani · November 29, 2017, 3:13pm

I'm a little confused about how I should put data in the test set. Suppose I'm trying to build an image classifier for pencils and pens.
1. First, I made a folder named pencilpen.
2. In the pencilpen folder, I made three subfolders named test, train, and valid.
3. In the train folder, I made two subfolders, pencil and pen, and put the corresponding images in each.
4. In the valid folder, I made two subfolders, pencil and pen, and put the corresponding images in each.
5. Then what should I put inside the test folder? Do I put the images of pencils and pens together, directly in the test folder?

Please guide me and correct me if I'm wrong.

Thanks in advance.

jeremy · November 29, 2017, 6:03pm

That sounds perfect!

naveenmanwani · November 29, 2017, 6:06pm

So, Jeremy, no pencil and pen subfolders in the test set. In the test set I'll put the images of pencils and pens together.

ramesh · November 29, 2017, 9:11pm

Correct. You don't know the labels for the test data, so the library expects the images to be there without any labels (that is, not in subfolders). You can think of the test folder as very similar to a Kaggle submission (or test) set.
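
To make the layout concrete, here is a minimal sketch in plain Python that creates the folder structure described in this thread. The helper names `make_layout` and `describe` are made up for illustration and are not part of any library; the folder and class names are the ones from the question.

```python
import tempfile
from pathlib import Path


def make_layout(root, classes=("pencil", "pen")):
    """Create train/valid folders with one subfolder per class, and a flat test folder."""
    root = Path(root)
    # train and valid are labeled by folder name, so each class gets its own subfolder
    for split in ("train", "valid"):
        for cls in classes:
            (root / split / cls).mkdir(parents=True, exist_ok=True)
    # test has no labels, so images go directly inside it with no class subfolders
    (root / "test").mkdir(parents=True, exist_ok=True)
    return root


def describe(root):
    """Return every directory under root as a sorted list of relative paths."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir())


if __name__ == "__main__":
    # Use a temporary directory so the sketch does not touch real data
    with tempfile.TemporaryDirectory() as tmp:
        root = make_layout(Path(tmp) / "pencilpen")
        for d in describe(root):
            print(d)
```

The pencil and pen images for testing would then be copied straight into `pencilpen/test`, mixed together.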

When evaluating, you can compare the true labels with the test predictions to score the model. But you should not tune the model based on this result; otherwise the test set is no different from the validation set, and you might end up choosing a model that's biased towards the test set.
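
As a rough sketch of that scoring step, assuming you have the true labels for the test images available separately: the `accuracy` function and the example labels below are hypothetical and only illustrate the comparison, not how any particular library does it.

```python
def accuracy(true_labels, predictions):
    """Fraction of test images whose predicted label matches the true label."""
    if len(true_labels) != len(predictions):
        raise ValueError("true_labels and predictions must have the same length")
    if not true_labels:
        raise ValueError("cannot score an empty test set")
    correct = sum(t == p for t, p in zip(true_labels, predictions))
    return correct / len(true_labels)


if __name__ == "__main__":
    # Hypothetical labels for four test images, in the same order as the predictions
    true_labels = ["pen", "pencil", "pen", "pencil"]
    predictions = ["pen", "pen", "pen", "pencil"]
    print(accuracy(true_labels, predictions))
```

The idea is to run this once, at the end, as a final check, and to do all model selection using the valid folder.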

naveenmanwani · November 30, 2017, 12:52am

Thank you for another simple yet effective explanation.