Sort order of predict()?

wnurmi · March 12, 2018, 10:16pm

Hey guys! Thanks for the awesome community. I’m trying to apply the lesson 2 multi-label prediction to Google’s landmark dataset and have run into a silly problem when trying to generate the submission for Kaggle. I’m getting decent accuracy on the validation set, but I have no clue which predictions go with which images in the test set.

fastai usually seems to sort things alphabetically, but judging by the y labels, the output order of predict() is neither the alphabetical order of the validation files (as indexed by val_idxs) nor the CSV line order… so I’m lost.

I tried my best to find advice on the forums but so far no luck. I also skimmed through lesson 3, which was supposed to have some tips for making Kaggle submissions, but I couldn’t find those either. Any tips?

amaraz · June 27, 2018, 9:48pm

You need to construct the submission like this:
submission = pd.DataFrame(dict(name=[s[5:-4] for s in data.test_ds.fnames], prediction=preds[:,1]))
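The one-liner above works because it reads the file names and the predictions in the same row order, so row i of the predictions belongs to file i in data.test_ds.fnames. Here is a minimal sketch of that pairing using only the standard library. The helper name pair_names_with_predictions and the sample values are made up for illustration, and it assumes the test paths look like "test/<id>.jpg", which is what the s[5:-4] slice suggests.

```python
import csv
import io
import os


def pair_names_with_predictions(fnames, preds, column=1):
    """Pair each test file's ID with the prediction in the same row.

    Hypothetical helper: it assumes fnames and preds share one row order.
    """
    if len(fnames) != len(preds):
        raise ValueError("fnames and preds must have the same length")
    # "test/abc.jpg" -> "abc"; a sturdier variant of the s[5:-4] slice.
    names = [os.path.splitext(os.path.basename(f))[0] for f in fnames]
    return [(name, row[column]) for name, row in zip(names, preds)]


# Made-up stand-ins for data.test_ds.fnames and the prediction array.
fnames = ["test/b2.jpg", "test/a1.jpg", "test/c3.jpg"]
preds = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]

rows = pair_names_with_predictions(fnames, preds)

# Write the rows as CSV text; the column names follow the post above.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["name", "prediction"])
writer.writerows(rows)
csv_text = buf.getvalue()
```

If you prefer pandas, the same rows can go into pd.DataFrame(rows, columns=["name", "prediction"]). Stripping the folder and extension with os.path instead of fixed slice offsets means it keeps working if the folder name or file extension has a different length.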