Random tidbit I came across while studying: if you’re cleaning the data for a vision model (lessons 1 and 2) and following along with the chapter 2 production notebook, using
fns = get_image_files(path)
won’t reveal non-image files to your verify_images function later. I haven’t noticed whether this affects the learner (I think the dataloaders know to ignore non-images anyway), but if you write a function to convert images to JPEG, the non-image files you never saw will cause errors when it tries to open them. Instead, using:
fns = get_files(path)
will expose everything in path.
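To see what is being hidden before you train, here is a minimal standard-library sketch of the same idea. It assumes, as far as I can tell from the fastai source, that get_image_files selects files by extension using the image types known to Python's mimetypes module. The helper name split_by_extension is made up for illustration and is not part of fastai.

```python
import mimetypes
from pathlib import Path

# Extensions that the standard library maps to an image/* MIME type.
# Assumption: this approximates the filter get_image_files applies.
IMAGE_EXTENSIONS = {
    ext for ext, mime in mimetypes.types_map.items() if mime.startswith("image/")
}

def split_by_extension(folder):
    """Return (image_like, other) lists of files under `folder`, judged by suffix only."""
    image_like, other = [], []
    for p in sorted(Path(folder).rglob("*")):
        if not p.is_file():
            continue
        if p.suffix.lower() in IMAGE_EXTENSIONS:
            image_like.append(p)
        else:
            other.append(p)  # these are the files an extension filter would hide
    return image_like, other
```

Calling split_by_extension(path) and printing the second list shows the files that only get_files would surface. Note that this checks the suffix only, so a broken download saved as .jpg still lands in the first list; that is what verify_images is for.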
This is useful because PIL will emit a warning during your first training run when the dataloader comes across non-JPEG images (RGBA instead of RGB). A conversion function:
from PIL import Image

for fpath in subdir.iterdir():  # subdir: a pathlib.Path to one class folder
    if fpath.suffix != '.jpg':
        im = Image.open(fpath)
        im = im.convert('RGB')  # drop the alpha channel
        im.save(fpath.with_suffix('.jpg'))
        fpath.unlink()  # delete original file
will remove this warning later on.
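One catch: once get_files exposes everything, a loop like the one above will also hit files that are not images at all, and Image.open raises on those. Below is a hedged sketch of a more defensive version. The function name convert_folder_to_jpeg is my own, it assumes a reasonably recent Pillow (UnidentifiedImageError was added in 7.0), and it leaves unreadable files in place rather than deleting them.

```python
from pathlib import Path
from PIL import Image, UnidentifiedImageError

def convert_folder_to_jpeg(folder):
    """Convert every non-.jpg file in `folder` to an RGB JPEG.

    Returns the list of files PIL could not read, left untouched on disk.
    """
    unreadable = []
    for fpath in sorted(Path(folder).iterdir()):
        # Skip subfolders and files that already carry a .jpg/.JPG suffix.
        if not fpath.is_file() or fpath.suffix.lower() == ".jpg":
            continue
        try:
            with Image.open(fpath) as im:
                rgb = im.convert("RGB")  # drops alpha / palette information
        except (UnidentifiedImageError, OSError):
            unreadable.append(fpath)  # not an image, or corrupt
            continue
        rgb.save(fpath.with_suffix(".jpg"), "JPEG")
        fpath.unlink()  # delete the original only after the JPEG is written
    return unreadable
```

Run it once per class folder and check the returned list before training. Be aware that it overwrites an existing .jpg with the same stem (for example cat.png next to cat.jpg), so rename duplicates first if that matters for your dataset.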