Count images in an ImageDataBunch

Hey, I hope I’m just missing something here. I want to run the lesson 1 dataset, but with the images passed through a custom filter I made. I have all of the images in a folder called ‘Data’, and their names are formatted like this: “Abyssinian_23.png”.
I load them like this:

pat = r'/([^/]+)_\d+\.png$'
data_filtered = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=bs)

Now when I train the model and look at the confusion matrix, the entries of the matrix sum to 295. When I go through the same process with the original dataset (after untarring it), I get 1478.
Does anyone have an idea how to solve this?

The confusion matrix (and ClassificationInterpretation in general) works with your validation set only by default.
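
To put rough numbers on that: as far as I know, from_name_re holds out valid_pct=0.2 of the files for validation when you don't pass anything else, and the cut is truncated rather than rounded. The helper below is only a sketch of that arithmetic (validation_size is a made-up name, not a fastai function), and the 7390 figure is my assumption about the size of the full lesson 1 set.

```python
def validation_size(n_images, valid_pct=0.2):
    # Assumed behaviour of a random percentage split:
    # the number of held-out items is truncated, not rounded.
    return int(valid_pct * n_images)

# 20% of a folder with 1478 files
print(validation_size(1478))

# 20% of an assumed 7390 files in the original dataset
print(validation_size(7390))
```

If those numbers line up with yours, the 295 would simply be 20% of the files that were loaded, and passing a different valid_pct to from_name_re is where you would control it.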


So why is the validation set size 1478 in the lesson 1 example and 295 for me? Where can I control it?

Are you 100% certain you didn’t miss any images when moving them over, and that your number of classes looks correct?

Yes, I counted the images in the folder and they sum to 1478.
I suspect the regular expression, but I am not sure…
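
One way to rule the regular expression in or out without training anything is to run it over the file names with plain Python and count what it extracts. This is just a sketch: the sample paths are invented, label_counts is a hypothetical helper, and the pattern is the one from the first post with straight quotes and the dot escaped. Note that the pattern expects a "/" before the file name, so it needs the path and not the bare name.

```python
import re
from collections import Counter

# Pattern from the first post, with the dot before "png" escaped
PAT = re.compile(r'/([^/]+)_\d+\.png$')

def label_counts(paths):
    """Return (images per extracted label, paths the pattern did not match)."""
    counts, unmatched = Counter(), []
    for p in paths:
        # Normalise Windows separators so the "/" in the pattern can match
        m = PAT.search(str(p).replace('\\', '/'))
        if m:
            counts[m.group(1)] += 1
        else:
            unmatched.append(p)
    return counts, unmatched

# Invented sample paths, for illustration only
sample = [
    'Data/Abyssinian_23.png',
    'Data/Abyssinian_24.png',
    'Data/american_bulldog_101.png',
    'Data/Abyssinian_25.jpg',  # wrong extension, should not match
]
counts, unmatched = label_counts(sample)
print(counts)
print(unmatched)
```

With the real fnames list, the counts should add up to the number of files and the number of keys should equal the number of breeds; anything in unmatched points at a naming problem.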

I am now trying to put each class in its own folder and create the image databunch from the folder. Is there a way to do that without the ‘train’, ‘test’ and ‘validation’ folders, while still creating a validation set?

You can use split_by_rand_pct for that. See the lesson 2 notebook; that’s how its data is set up.
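
A minimal sketch of that with the data block API, assuming fastai v1 and one sub-folder per class directly under the data path. build_databunch and its defaults are names picked for this example, not something from the lesson notebooks, and the fastai import sits inside the function so the snippet can be pasted anywhere.

```python
def build_databunch(path, bs=64, valid_pct=0.2, size=224):
    """Build a databunch from <path>/<class_name>/<image> with a random split."""
    # Assumes fastai v1 is installed; imported lazily on purpose
    from fastai.vision import ImageList, get_transforms, imagenet_stats

    return (ImageList.from_folder(path)            # collect every image under path
            .split_by_rand_pct(valid_pct, seed=42) # hold out valid_pct at random
            .label_from_folder()                   # label = name of the parent folder
            .transform(get_transforms(), size=size)
            .databunch(bs=bs)
            .normalize(imagenet_stats))
```

Calling build_databunch('Data') and then checking len(data.train_ds) and len(data.valid_ds) should show how many images ended up on each side of the split.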