Path.ls ++

I’ve been using an enhanced version of the Path.ls functionality implemented in fastai. It has been very useful in my work, so I thought it might be worth sharing.

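from pathlib import Path

def filter_files(files, include=(), exclude=()):
    # Keep only files whose name contains every substring in `include`...
    for incl in include:
        files = [f for f in files if incl in f.name]
    # ...and drop files whose name contains any substring in `exclude`.
    for excl in exclude:
        files = [f for f in files if excl not in f.name]
    return sorted(files)

def ls(x, recursive=False, include=(), exclude=()):
    # Non-recursive: direct children only. Recursive: everything below x.
    if not recursive:
        out = list(x.iterdir())
    else:
        out = list(x.glob('**/*'))
    return filter_files(out, include=include, exclude=exclude)

# Attach as a method so it can be called as path.ls(...)
Path.ls = ls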

It can list files in all subdirectories and also apply filters, for example:

path = Path('data')

# List files, including subdirectories, with .tif in the name but excluding .tif.xml:
path.ls(recursive=True, include=['.tif'], exclude=['.xml'])

This is good. Printing the size would also be a good enhancement. I hope you can raise a PR and get it merged back.

For the recursive path, you should look at our get_files function and os.walk, as it’s way faster than glob('**/*') if you have a huge dataset (like ImageNet).
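
As a rough sketch of what an os.walk-based version of the recursive branch could look like (this is not the actual get_files code; the name walk_files is made up for illustration, and the include/exclude arguments just mirror the filters from the first post):

```python
import os
from pathlib import Path

def walk_files(root, include=(), exclude=()):
    """Hypothetical helper: recursively list files under `root` with os.walk.

    A file is kept when its name contains every substring in `include`
    and none of the substrings in `exclude`.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if all(s in name for s in include) and not any(s in name for s in exclude):
                found.append(Path(dirpath) / name)
    return sorted(found)
```

Note that this only returns files, while glob('**/*') also returns the directories themselves, so the two are not exact drop-in replacements for each other.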

Thanks, I will take a look at it!