Corrupt images crashing - is there any way to quickly find and delete them?

In a perfect world, all of our images could be read correctly by the library. However, some of the images in my dataset are corrupt and cause it to crash and give up when I try to load them into, say, a DataLoader.

Right now, I’m calling open_image() on every file and catching exceptions to find the corrupt ones, then deleting them before training, but it’s turning out to be VERY slow.
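
For reference, the scan itself can at least be spread over several workers. This is only a minimal sketch: find_unreadable and its parameters are names made up for illustration (not part of any library), and the loader is passed in as a plain callable so that open_image, or anything else that raises on a bad file, can be plugged in.

```python
from concurrent.futures import ThreadPoolExecutor


def find_unreadable(paths, loader, max_workers=8):
    """Return the paths for which loader(path) raises an exception.

    `loader` is any callable that raises when a file cannot be read;
    it is an assumption of this sketch, not a specific library function.
    """
    def check(path):
        try:
            loader(path)
            return None          # loaded fine
        except Exception:
            return path          # anything that raises is reported

    # pool.map preserves input order, so the result order matches `paths`.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return [p for p in pool.map(check, paths) if p is not None]
```

Threads only help if the loader spends its time in I/O or in code that releases the GIL. If decoding is CPU-bound, a process pool may be the better choice. The function only reports the failing paths, so deleting them stays a separate, deliberate step.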

Is there a faster way to do this? Is there a built-in function (especially in v1) that will detect, delete, or skip corrupt images? If not, is there a way to read an image’s headers to quickly tell whether it’s corrupt, so the file can be deleted before it crashes the pipeline at load time?
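
To make the header idea concrete, here is a rough sketch of the kind of check I mean. It uses only the standard library and looks at the first and last few bytes of a file for the JPEG or PNG start and end markers. looks_intact and tail_bytes are hypothetical names, and this is a cheap pre-filter only: it catches truncated downloads, but a file can pass it and still fail to decode.

```python
from pathlib import Path

JPEG_START = b"\xff\xd8"            # JPEG start-of-image marker
JPEG_END = b"\xff\xd9"              # JPEG end-of-image marker
PNG_START = b"\x89PNG\r\n\x1a\n"    # 8-byte PNG signature
PNG_END = b"IEND\xaeB`\x82"         # IEND chunk type followed by its CRC


def looks_intact(path, tail_bytes=64):
    """Cheap check: known start marker plus end marker near the end of file.

    Returns False for unrecognised formats, so treat False as
    "needs a closer look", not as proof of corruption.
    """
    path = Path(path)
    size = path.stat().st_size
    if size < 16:                    # too small to be a real image
        return False
    with path.open("rb") as f:
        head = f.read(8)
        f.seek(max(0, size - tail_bytes))
        tail = f.read()
    if head.startswith(JPEG_START):
        return JPEG_END in tail
    if head.startswith(PNG_START):
        return PNG_END in tail
    return False
```

Since only a few dozen bytes per file are read, this should be much faster than a full decode. Files that fail the check are probably better moved to a side folder and inspected than deleted outright.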


I think there’s a verify_images function in the library. It was used in lesson 2 of the new version of the course.