`get_image_files()` not recognizing images with extension `.jpg!d`

itacdonev · December 8, 2022, 5:34pm

I am building a computer vision model based on lectures 1 and 2 to classify flowers: daisies, tulips, and roses.

After downloading the data and verifying the images, I manually counted the files remaining in each category. Then I ran get_image_files(path) and got a different count: it skipped 3 images with the extension .jpg!d.

I ran the following code to check for the differences, where nif is the list of all image files for all three classes:

# Get the image files using fastai
faif = get_image_files(path)

# Compute the difference
d = list(set(nif)-set(faif))
print('Image files missing in fastai list:')
d

with results:

Image files missing in fastai list:
[Path('images/roses/a2b53eac-4d12-45a4-bfe7-b9387621dfc3.jpg!d'),
 Path('images/roses/18e8f791-4a45-4e4b-9af2-23153a73c855.jpg!d'),
 Path('images/tulips/e49da1d5-7379-4ab1-8526-a08d5feb32c0.jpg!d')]
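
A quick way to spot unusual extensions like this before comparing lists is to tally the suffixes of everything under the data folder. This is only a minimal sketch using the standard library; count_suffixes is a hypothetical helper name, not part of fastai.

```python
from collections import Counter
from pathlib import Path

def count_suffixes(root):
    """Tally the lower-cased suffix of every file under root (hypothetical helper)."""
    # Path.suffix is everything from the last dot on, so 'x.jpg!d' gives '.jpg!d'
    return Counter(p.suffix.lower() for p in Path(root).rglob('*') if p.is_file())
```

Calling count_suffixes('images') on a folder like the one above should list .jpg!d as its own entry next to .jpg, which makes the odd files easy to find.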

Plotting the three image files I get the following:

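import matplotlib.pyplot as plt
from PIL import Image

# Show the skipped files side by side to confirm they are valid images
rows, columns = 1, 3
fig = plt.figure(figsize=(10, 7))
for i, f in enumerate(d):
    fig.add_subplot(rows, columns, i + 1)
    im = Image.open(f)
    plt.imshow(im.resize((256, 256)))
plt.show()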

[Screenshot: the three images plotted side by side]

Is this the intended behavior of get_image_files()?

benkarr · December 8, 2022, 7:15pm

Hey,
If you check the implementation with get_image_files??, you’ll see that it just returns get_files(path, extensions=image_extensions, …). image_extensions is a set of common image extensions, and it doesn’t include .jpg!d. You can add it with image_extensions.add('.jpg!d'), after which get_image_files(path) should return all of your images.
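
To illustrate why adding the extension to the set changes the result, here is a simplified stand-in for that kind of extension filter. It is not fastai's actual code: image_exts and list_images are hypothetical names, and building the set from the standard mimetypes table is an assumption about how such a set of common image extensions might be produced.

```python
import mimetypes
from pathlib import Path

# Assumption: treat every extension the standard library maps to image/* as an image
image_exts = {ext for ext, mime in mimetypes.types_map.items()
              if mime.startswith('image/')}

def list_images(root, extensions=None):
    """Return files under root whose suffix is in the extension set (hypothetical helper)."""
    # The module-level set is read at call time, so later additions take effect
    exts = {e.lower() for e in (image_exts if extensions is None else extensions)}
    return sorted(p for p in Path(root).rglob('*')
                  if p.is_file() and p.suffix.lower() in exts)
```

In this sketch a file named x.jpg!d is skipped until image_exts.add('.jpg!d') is called, because its suffix is compared literally against the set and nothing tries to guess that it is a JPEG.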

If you know all the file extensions in your dataset, you can also call get_files directly:

faif = get_files(path, extensions=['.jpg','.jpg!d','.png',…])

Hope this helps.