I have been trying to construct a list of pairs of filenames and the corresponding ground truths. Here’s how I tried to do this:
train_preds, train_ground_truths = learn.get_preds(ds_type=DatasetType.Train)
train_paths = list(map(lambda path: str(path), learn.data.valid_ds.items))
train_fnames = list(map(lambda path: path.split('/')[:-1], train_paths))
train_fnames_and_ground_truths = list(zip(train_fnames, train_ground_truths))
However, after a few manual checks it seems that the two arrays I am zipping together here are completely misaligned… How can I do this correctly?
There’s a similar thread here,
Get the filenames of the data in the Test set in the order they're predicted, but there's no definitive, practical answer there either.
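For what it's worth, the filename-extraction and zipping part can be sketched in plain Python (a sketch only, with hypothetical paths and labels standing in for `learn.data` attributes; note that `split('/')[:-1]` keeps the directory components and drops the filename, which `os.path.basename` avoids):

```python
import os

# Hypothetical paths standing in for the dataset's items list
train_paths = [
    "data/train/cat/cat_001.jpg",
    "data/train/dog/dog_002.jpg",
]
ground_truths = ["cat", "dog"]  # hypothetical labels, same order as the paths

# os.path.basename keeps just the filename; split('/')[:-1] would
# instead return the directory components and drop the filename.
train_fnames = [os.path.basename(p) for p in train_paths]

fnames_and_truths = list(zip(train_fnames, ground_truths))
```

This only lines up if the paths and the ground truths really are in the same order, which is the crux of the question.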
Hmm, I guess someone needs to submit a pull request to update fast.ai docs+library to make this easier but check out my previous posts:
How to review images for bad classifications + How do I get my list of predictions match the order of the images in my test folder?
It’s not a copy&paste solution, but I think these posts should provide you with all the ingredients. I guess the main point for you is that you need the ds_idx instead of just iterating over the predictions positionally.
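A minimal illustration of that idea, with hypothetical data (no fastai involved): if the predictions come back in an order described by a list of dataset indices, look the filenames up by those indices rather than zipping positionally:

```python
# Hypothetical: filenames in dataset order
items = ["a.jpg", "b.jpg", "c.jpg"]
# Hypothetical: predictions were produced in this (shuffled) dataset order
ds_idx = [2, 0, 1]
preds = ["pred_for_c", "pred_for_a", "pred_for_b"]

# Wrong: a positional zip pairs preds with the unshuffled filename list
wrong = list(zip(items, preds))

# Right: index the filename list with the dataset index of each prediction
right = [(items[idx], p) for idx, p in zip(ds_idx, preds)]
```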
Go look at my new PR
Here is how I extrapolated:
for j, idx in enumerate(self.tl_idx):
    da, cl = self.interp.data.dl(self.interp.ds_type).dataset[idx]
    img, lbl = self.interp.data.valid_ds[idx]
    fn = self.interp.data.valid_ds.x.items[idx]
    # strip the directory prefix, keeping the trailing name_number.ext part
    fn = re.search(r'([^/]+)_\d+.*$', str(fn)).group(0)
    x += 1  # x is a counter defined outside this snippet
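As a standalone check of the regex step (hypothetical path; the pattern pulls the final name_number.ext component out of a full path, and the capture group holds just the name prefix):

```python
import re

fn = "data/train/cat/cat_0123.jpg"  # hypothetical full path
m = re.search(r'([^/]+)_\d+.*$', fn)
filename = m.group(0)  # "cat_0123.jpg": the last path component
prefix = m.group(1)    # "cat": the part before the trailing _digits
```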
@haverstind and @muellerzr, I’ll check out both
Could you share a link to your PR? I couldn’t find it
@haverstind, I checked both of the linked posts, but it seems to me that what you did there was filter a subset of the predictions, which is why you needed to keep track of the indices. If no filtering is done in your loops, then i == ds_idx for every entry, so the index tracking isn’t really doing anything. I tried to follow those snippets anyway, but I am still getting inconsistent columns:
Could you please spell out what you had in mind? Maybe I misunderstood.
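To make that point concrete (plain Python, hypothetical data): with no filtering, enumerate's counter and the dataset index coincide on every entry, and they only diverge once some indices are filtered out, which is when tracking the index explicitly starts to matter:

```python
items = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]

# No filtering: the counter i and the dataset index are always equal
for i, idx in enumerate(range(len(items))):
    assert i == idx

# With filtering (e.g. keeping only a subset of indices), they diverge
kept = [0, 2, 3]
pairs = [(i, idx) for i, idx in enumerate(kept)]
```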
Hey everyone! Have you ever wanted to explore further where your model was being confused? Take a peek at which specific images were being confused for class x vs class y? Or look at the distributions of your mislabeled classes for your tabular models? ClassConfusion will do exactly this!
For both images and tabular, we pass in our ClassificationInterpretation object, along with a list of classes, and we can specify if we want an ordered set of classes or just the two back to back. For insta…
That’s the PR. The code I showed is how I got the filenames associated with cases where two classes matched. You should be able to extrapolate from there. The source code link should be in there too, on my GitHub.
In the code, fn is the file name. You can see where it lives in items.
I’m using 1.0.44, so I don’t have the class that you are using there. But surely it can’t be that hard to zip together preds and fnames? This is such a frustrating thing; it should be a no-brainer…
I showed it with img and fname; give me a few minutes and I’ll show a better example.
This isn’t quite “zipping”, as I don’t know how to do that in Python yet, but the source code for that function is here:
Look under _plot_imgs.
Though it may not be the best choice looking back, as we assume losses are available. Is your data labeled? Or are we doing this on test sets?
This is what I was looking for.