Filter_by_folder

Hello - I’m a bit confused by the intended usage of filter_by_folder. In the screenshot, shouldn’t filter_by_folder exclude half the data, like the filter_by_func test below it?

Thanks for the clarification!

It only works for folders directly after path, not subfolders.
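Here is a quick way to see this with plain pathlib, assuming the filter only looks at the first path component relative to path. The folder names and the first_folder helper are made up for illustration; they are not part of fastai.

```python
from pathlib import Path

root = Path('data')                      # hypothetical root passed to from_folder
direct = root / '3' / 'img.jpg'          # '3' sits directly under root
nested = root / 'train' / '3' / 'img.jpg'  # '3' is one level deeper

def first_folder(item, root):
    # Name of the folder immediately below root that contains item.
    return item.relative_to(root).parts[0]

print(first_folder(direct, root))  # 3
print(first_folder(nested, root))  # train
```

For the nested item the filter would compare 'train', not '3', against include/exclude, so a subfolder named '3' is never matched.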

This doesn’t directly answer your question, but I noticed your two calls don’t do the same thing: the filter_by_folder one aims to exclude the 3s, while the filter_by_func one keeps them.
With that said, filter_by_folder works with relative paths of direct subfolders only, as it is implemented like this:

    def filter_by_folder(self, include=None, exclude=None):
        "Only keep filenames in `include` folder or reject the ones in `exclude`."
        include,exclude = listify(include),listify(exclude)
        def _inner(o):
            if isinstance(o, Path): n = o.relative_to(self.path).parts[0]
            else: n = o.split(os.path.sep)[len(str(self.path).split(os.path.sep))]
            if include and not n in include: return False
            if exclude and     n in exclude: return False
            return True
        return self.filter_by_func(_inner)

So basically, if you pass exclude=path/'train/3' and test it on a sample o=path/'train/3/thing.jpg', you’ll have:

    self.path = Path('/path/to/your/folder/train')
    include = []
    exclude = [Path('/path/to/your/folder/train/3')]
    o.relative_to(self.path).parts = ('3', 'thing.jpg')
    n = '3'
    n in exclude = False  # the string '3' never equals a Path object

So it will return True and keep it. I detailed this so you have a grasp of what it does.
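If you want to check this without loading any data, here is a minimal standalone sketch of the same logic. keep_item is a hypothetical helper that mirrors _inner for Path inputs only; it is not a fastai function, and it expects include/exclude as lists instead of calling listify.

```python
from pathlib import Path

def keep_item(o, root, include=None, exclude=None):
    # Hypothetical stand-in for _inner: compare only the first folder
    # name below root (a plain str) against include/exclude.
    include, exclude = list(include or []), list(exclude or [])
    n = Path(o).relative_to(root).parts[0]
    if include and n not in include: return False
    if exclude and n in exclude: return False
    return True

root = Path('/path/to/your/folder/train')
sample = root / '3' / 'thing.jpg'

# Excluding with a full Path: '3' is compared to a Path object, no match, item kept.
print(keep_item(sample, root, exclude=[root / '3']))  # True
# Excluding with the folder name: '3' == '3', item dropped.
print(keep_item(sample, root, exclude=['3']))         # False
```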

tl;dr : Just call ImageList.from_folder(path/'train').filter_by_folder(exclude='3')

EDIT: damn I’m late

Thanks for the detailed response @florobax. This helps.
