ImageDataBunch not loading test data in order

I have a folder full of images, and for testing I am passing its path to ImageDataBunch:
data = ImageDataBunch.from_folder('./data/', ds_tfms=get_transforms(), test='./test/', size=224, num_workers=10)

I am getting good accuracy, but when I submit the predictions I get really low accuracy, because the test dataset is not loaded in order, so the predictions below come out in the wrong order:
p = learn_cnn.get_preds(ds_type=DatasetType.Test)

I debugged this and found that the images in the folder and the images in the test dataset are not in the same order.
I checked that with:

data.test_ds[4][0]

Then I went to my folder and verified that the 4th image (counting from the 0th element) is not the same one.

So when I run

p = learn_cnn.get_preds(ds_type=DatasetType.Test)
this will obviously predict in a different order.
So how can I load the test data in the same order as the folder?

In other words:
Why is test_df.iloc[4] not the same as data.test_ds[4][0]?
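One way to sidestep the ordering problem is to key each prediction by file name instead of by position. The sketch below uses only the standard library and toy data; `align_predictions` is a hypothetical helper, and it assumes you can get the file paths in the order the test set was actually loaded (in fastai v1 this is probably `data.test_ds.items`, but treat that as an assumption and check it).

```python
from pathlib import Path

def align_predictions(loaded_paths, predictions, wanted_names):
    """Reorder predictions so they follow `wanted_names`.

    loaded_paths: file paths in the order the dataset actually loaded them
    predictions:  one prediction per loaded path, in that same order
    wanted_names: file names in the order the submission expects
    """
    if len(loaded_paths) != len(predictions):
        raise ValueError("need exactly one prediction per loaded path")
    # Key each prediction by file name so the loading order no longer matters.
    by_name = {Path(p).name: pred for p, pred in zip(loaded_paths, predictions)}
    return [by_name[name] for name in wanted_names]

# Toy data standing in for the loaded test paths and the model outputs.
loaded = ["test/img_3.jpg", "test/img_1.jpg", "test/img_2.jpg"]
preds = ["cat", "dog", "bird"]
wanted = ["img_1.jpg", "img_2.jpg", "img_3.jpg"]

aligned = align_predictions(loaded, preds, wanted)
print(aligned)
```

With this, the order in which the test folder was read stops mattering, as long as the file names are unique.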

Hi, for some reason the ImageDataBunch.from_folder() method doesn't have a presort parameter. However, in data_block.py the get_files() function has a presort param, which is False by default, and get_files() is used by ImageList.from_folder(), which ImageDataBunch.from_folder() calls internally.

I think you can use ImageDataBunch.from_df(), since you already created a df to test this and it is in the right order for your submission :)
(Or, if that's not OK, you can call get_files() yourself with some custom code and use its presort.)
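As a rough illustration of what a presorted listing gives you, here is a standard-library sketch that lists the images in a folder in a deterministic, sorted order. It is not fastai's implementation; `list_images_sorted` and the extension list are hypothetical names chosen for the example. A list like this could be used to build the rows of a df in a known order.

```python
import tempfile
from pathlib import Path

def list_images_sorted(folder, extensions=(".jpg", ".jpeg", ".png")):
    """Return the image paths directly inside `folder`, sorted by file name."""
    folder = Path(folder)
    files = [p for p in folder.iterdir()
             if p.is_file() and p.suffix.lower() in extensions]
    # Sorting makes the order reproducible instead of filesystem-dependent.
    return sorted(files, key=lambda p: p.name)

# Small demo on a temporary folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["b.jpg", "a.jpg", "10.jpg", "2.jpg", "notes.txt"]:
        (Path(tmp) / name).touch()
    names = [p.name for p in list_images_sorted(tmp)]
    print(names)
```

Note that a plain string sort puts 10.jpg before 2.jpg, so if the submission expects numeric order you would need a numeric sort key instead.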

I just looped over the data with learn.pred, but thank you, I will try that next time.

What about something like

class NoRandDL(DataLoader):
    def randomize(self):
        self.rng = 3  # random.Random(self.rng.randint(0,2**32-1))