DatasetFormatter.from_toplosses returning all images instead of only the top_losses images

SkyeRig · September 29, 2019, 6:51pm

I am using the image cleaner widget to look at the photos that are being mislabeled. When I call DatasetFormatter().from_toplosses(), it returns all the images in my training and validation sets instead of only the ones with the top losses.

db = (ImageList.from_folder(path)
                   .split_none()
                   .label_from_folder()
                   .transform(get_transforms(), size=224)
                   .databunch())

data_bunch = ImageList.from_folder(path)
learn_cln = cnn_learner(db, models.resnet34, metrics=error_rate)

learn_cln.load('stage-2');

ds, idxs = DatasetFormatter().from_toplosses(learn_cln)
len(ds)
>>> 548 # len(ds) is the total number of images in my train and validation sets. Shouldn't it be less?

EnHakore · February 26, 2020, 5:58am

@SkyeRig I have the same problem… how did you resolve it?

EnHakore · March 4, 2020, 6:14am

# interp is a ClassificationInterpretation object

interp = ClassificationInterpretation.from_learner(learn)

losses,idxs = interp.top_losses(k=200)

So k=200 will return the 200 largest losses together with the indices of the corresponding images.
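In case it helps anyone else landing here, this is a rough sketch in plain Python of what a top-k selection over per-image losses does. It is not fastai's actual implementation; `top_k_losses` and the sample loss values are made up for illustration. The point is that the dataset itself keeps its full length, and it is the returned indices that tell you which images have the highest losses.

```python
import heapq

def top_k_losses(losses, k):
    """Return (top_losses, top_idxs) for the k largest losses, largest first.

    Hypothetical helper for illustration only, not part of fastai.
    """
    # Pair each loss with its position, then keep the k largest pairs.
    top = heapq.nlargest(k, enumerate(losses), key=lambda pair: pair[1])
    top_idxs = [i for i, _ in top]
    top_vals = [v for _, v in top]
    return top_vals, top_idxs

# Made-up per-image losses for a 6-image dataset.
losses = [0.10, 2.30, 0.05, 1.70, 0.40, 3.10]

top_vals, top_idxs = top_k_losses(losses, k=3)
print(top_vals)     # [3.1, 2.3, 1.7]
print(top_idxs)     # [5, 1, 3]
print(len(losses))  # 6, the dataset itself is not shortened
```

As far as I can tell, `from_toplosses` in fastai v1 behaves along the same lines: it hands back the whole dataset plus indices sorted by loss, so `len(ds)` staying at the full size is expected, and `len(idxs)` is the number to check. I believe it also accepts an `n_imgs` argument to limit how many indices come back, but please check that against your installed version.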