How do I get my list of predictions to match the order of the images in my test folder?

Hi everyone,

I’ve followed the first part of the course and built a model to detect sleeve length in clothes.
I managed to run the model on my test folder and get labeled predictions in a list.

Problem is, it seems like the predictions come out shuffled, so I cannot tell which picture in the test folder each prediction refers to.

Let me explain this further. When I pass the first image (00000000.jpg) of the test folder into the model, I get a different prediction than the first result returned by running get_preds() on the whole test folder.

To get the prediction for only the first image I simply use:

file = '/content/drive/My Drive/fastai/test/00000000.jpg'
img = open_image(file)
pred = my_trained_mod.predict(img)
pred

result: 'sleeveless'

To get the list of predictions for the whole test folder I use:

preds,_ = my_trained_mod.get_preds(ds_type=DatasetType.Test) 
labels = np.argmax(preds, 1)
test_predictions = [data.classes[int(x)] for x in labels]

Then I take only the first prediction:

test_predictions[0]
result = 'long sleeves'

As you can see, the first prediction at the top of the list does not match the single prediction for ‘00000000.jpg’, which is the first picture in my test folder…

How do I get my list of predictions to match the order of the images in my test folder?

UPDATE
I was able to solve the issue because download_images() gives the downloaded images ascending numbers as file names, which is very helpful. To get the test predictions in order I used this code, where 191 is the number of pics in my test folder (192):

path = '/content/drive/My Drive/fastai/IC2/test'
test_predictions = []
for i in range(0, 191):
    # download_images() names the files with zero-padded 8-digit numbers,
    # e.g. 00000000.jpg, 00000001.jpg, ...
    file = f'{path}/{i:08d}.jpg'
    img = open_image(file)
    # predict() returns a tuple; the first element is the predicted category
    single_pred = my_trained_mod.predict(img)[0]
    test_predictions.append(single_pred)
  
test_predictions
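
A possible way to avoid hard-coding the number of files and the zero-padding is to list the folder and sort the names. This is only a sketch: `sorted_image_paths` is a hypothetical helper, not part of fastai, and it assumes the file names are all zero-padded to the same width (as they are here), so that alphabetical order equals numeric order.

```python
from pathlib import Path

def sorted_image_paths(folder, pattern="*.jpg"):
    """Return the paths in `folder` matching `pattern`, sorted by file name.

    Hypothetical helper (not a fastai function). Sorting by name only gives
    numeric order when every name is zero-padded to the same width.
    """
    return sorted(Path(folder).glob(pattern), key=lambda p: p.name)
```

Each returned path could then be passed to open_image() and predict() exactly as in the loop above, so the i-th prediction belongs to the i-th sorted file name.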

This was an earlier issue where you had to specify ordered=True in get_preds. Try upgrading your fastai version.

I think you could rephrase your question to “How can you refer back from a specific prediction to its source (e.g. file name)?”

I had a similar question a few months ago (How to review images for bad classifications) and I think I also got to the file name by just adding 1-2 extra lines of code.

Hi Kushaj, thanks for the reply.
I have fastai 1.0.54, isn’t that the latest version?
I am using it on Google Colab and I installed fastai with

!curl -s https://course.fast.ai/setup/colab | bash

Hi Harverstind, thanks for the reply, you got my point. I tried your code; below I quickly modified it to return only the index and predicted class for all predictions.

preds, y = my_trained_mod.get_preds(DatasetType.Test)
test_predictions = []
for i, (pred, gt_class) in enumerate(zip(preds, y)):
    pred_probability, predicted_class = torch.topk(pred, 1)
    test_predictions.append((i, predicted_class))

test_predictions

But by simply indexing the images in the test folder I get the same results as with my code, just with an ascending index to the left of the predictions. The first image in my predictions is still not the same as the first one in my folder.

How do I get hold of the image name? :stuck_out_tongue:

Hey memen7omori,

I just checked my code and found this:

ds_idx, predicted_class, probability = bad_predictions[0]
fn = learner.data.valid_ds.x.items[ds_idx]

So yes, the Image does not have a reference to its path, but if you have the ds_idx (i in the loop above) you can query the DataBunch / the associated ImageList for the file name.
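
To make the idea concrete, here is a minimal sketch of pairing each file name with its predicted class. `label_predictions` is a hypothetical helper, not a fastai function, and the data below are toy values. For the test set, the items would presumably come from `learner.data.test_ds.x.items` (an assumption by analogy with the `valid_ds` line above, not checked against every fastai version) and the predictions from get_preds(ds_type=DatasetType.Test).

```python
def label_predictions(items, preds, classes):
    """Pair each file name with its most probable class.

    items   : file paths, in the same order as the rows of `preds`
    preds   : one sequence of class probabilities per item
    classes : class names, indexed like the columns of `preds`
    """
    if len(items) != len(preds):
        raise ValueError("items and preds must have the same length")
    labelled = []
    for item, probs in zip(items, preds):
        probs = [float(p) for p in probs]
        # Index of the largest probability (a plain-Python argmax).
        best = max(range(len(probs)), key=probs.__getitem__)
        labelled.append((str(item), classes[best]))
    return labelled

# Toy data standing in for the real ImageList items and model output.
items = ["test/00000001.jpg", "test/00000000.jpg"]
preds = [[0.2, 0.8], [0.9, 0.1]]
classes = ["long sleeves", "sleeveless"]
labelled = label_predictions(items, preds, classes)
```

Because every prediction carries its file name, the order in which the dataset happens to list the files no longer matters; the result can be sorted by name afterwards if needed.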

HTH


ordered = True works only with Text classification predictions.

Has anyone ever figured out a solution to this? It seems kind of hacky to loop through all the images to maintain order in the test set. Doesn’t that kind of defeat the purpose of attaching a test set to the DataBunch?