Hello - I’m a bit confused by the intended usage of filter_by_folder. In the screenshot, shouldn’t filter_by_folder exclude half the data, like the filter_by_func test below it?
Thanks for the clarification!
It only works for folders directly after path, not subfolders.
This doesn’t directly answer your question, but I noticed you are not doing the same thing in both parts: your filter_by_folder call aims to exclude the 3s, while the other one will keep them.
With that said, filter_by_folder works with relative paths of direct subfolders only, as it is implemented like this:
def filter_by_folder(self, include=None, exclude=None):
    "Only keep filenames in `include` folder or reject the ones in `exclude`."
    include,exclude = listify(include),listify(exclude)
    def _inner(o):
        if isinstance(o, Path): n = o.relative_to(self.path).parts[0]
        else: n = o.split(os.path.sep)[len(str(self.path).split(os.path.sep))]
        if include and not n in include: return False
        if exclude and n in exclude: return False
        return True
    return self.filter_by_func(_inner)
So basically, if you pass path/'train/3' as exclude and try it on a sample o=path/'train/3/thing.jpg', you’ll have:
self.path = Path('/path/to/your/folder/train')
include = []
exclude = [Path('/path/to/your/folder/train/3')]
o.relative_to(self.path).parts = ('3', 'thing.jpg')
n = '3'
n in exclude = False (the string '3' is not equal to the Path object stored in exclude)
So it will return True and keep it. I detailed this so you have a grasp of what it does.
tl;dr: Just call ImageList.from_folder(path/'train').filter_by_folder(exclude='3')
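If you want to check this without loading any data, here is a minimal standalone sketch of the same comparison. It uses only pathlib. keep_item is a hypothetical helper that mirrors the Path branch of _inner quoted above; it is not part of fastai, and it takes plain lists instead of going through listify.

```python
from pathlib import Path

def keep_item(o, root, include=None, exclude=None):
    "Hypothetical re-creation of the Path branch of `_inner`; not fastai API."
    include, exclude = list(include or []), list(exclude or [])
    # Only the first folder name below `root` is ever looked at.
    n = Path(o).relative_to(root).parts[0]
    if include and n not in include: return False
    if exclude and n in exclude: return False
    return True

root = Path('/path/to/your/folder/train')
sample = root/'3'/'thing.jpg'

# A full Path in `exclude` never equals the string '3', so the item is kept.
kept_with_path = keep_item(sample, root, exclude=[root/'3'])
# The bare folder name matches, so the item is dropped.
kept_with_name = keep_item(sample, root, exclude=['3'])
# A '3' folder that is not directly under `root` is not matched either.
kept_nested = keep_item(root/'a'/'3'/'thing.jpg', root, exclude=['3'])

print(kept_with_path, kept_with_name, kept_nested)
```

The third case shows the "direct subfolders only" point: for a nested file the first path component is 'a', so excluding '3' has no effect on it.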
EDIT: damn I’m late
Thanks for the detailed response @florobax. This helps.