Predicting Test dataset in Fastai v2

satish860 · August 3, 2022, 8:37am

There’s a Kaggle task to estimate the class of plant seedlings, which I noticed while going through Fast AI 2022 session lesson 1. (Plant Seedlings Classification | Kaggle)

Training went well for me. To build the submission file, I’m using the code below.

val = []
for file in get_image_files(test_path):
    result = {}
    result['file']=file.name
    preds,_,_= learn.predict(PILImage.create(file))
    result['species']=preds
    val.append(result)

This code loops over the files one by one and predicts the class of each. Is there a more efficient way to accomplish the same thing?

satish860 · August 3, 2022, 8:39am

Here is the complete source code. PlantSeedlingsComp/Train.ipynb at main · satish860/PlantSeedlingsComp · GitHub.

I appreciate all of your help.

bencoman · August 3, 2022, 8:45am

I’m not sure if it’s more efficient (perhaps you could experiment and report),
but here is an alternative method using get_preds().

satish860 · August 3, 2022, 12:00pm

Thank you, @bencoman. When compared to repeatedly looping through the same data, the above method has proved to be very efficient.

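test_path = Path('data/test/')
# List the files once so file names and predictions stay in the same order
test_files = get_image_files(test_path)
test_dl = learn.dls.test_dl(test_files)
# The test set is unlabeled, so the second return value (targets) is ignored
preds, _, decoded = learn.get_preds(dl=test_dl, with_decoded=True)
val = []
for fn, cls_idx in zip(test_files, decoded):
    # Map the predicted class index back to its label
    val.append({'file': fn.name, 'species': learn.dls.vocab[int(cls_idx)]})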
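
For the last step, writing `val` out as a submission file, here is a minimal sketch using only the standard library. The `file` and `species` column names come from the code above. The `write_submission` helper, the `submission.csv` file name, the temporary output directory, and the two sample rows are assumptions for illustration, so check the competition page for the exact format it expects.

```python
import csv
import tempfile
from pathlib import Path

# Placeholder rows in the same shape as `val` above (illustrative values only);
# in practice `val` would come from the prediction loop.
val = [
    {'file': 'a.png', 'species': 'Maize'},
    {'file': 'b.png', 'species': 'Charlock'},
]

def write_submission(rows, out_path):
    """Write a list of {'file', 'species'} dicts to a CSV file with a header row."""
    # newline='' lets the csv module control line endings itself
    with open(out_path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=['file', 'species'])
        writer.writeheader()
        writer.writerows(rows)

# Hypothetical output location; any writable path would do
out_path = Path(tempfile.mkdtemp()) / 'submission.csv'
write_submission(val, out_path)
```

If pandas is already loaded in the notebook, `pd.DataFrame(val).to_csv(path, index=False)` should give an equivalent file.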