Possible bug in ImageClassifierData.from_paths when using a test set

Look at In [61] and In [62]: in the folder there are 10358 files versus 10361. Has anybody else seen this? Somehow I am getting 4 extra files through ImageClassifierData that I don’t see when I just ls the file location. If anybody has any ideas or can confirm this is a bug, I would appreciate it. Until I see a fix I will look into it myself and assume it’s a bug.
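For anyone who wants to double-check a folder count from Python rather than ls, here is a minimal sketch. The helper name `count_files` and the example path are made up for illustration, and it assumes the images sit directly in the folder, not in sub-folders. It counts only regular files, so a stray sub-directory doesn't inflate the number the way it can in a plain ls listing.

```python
from pathlib import Path

def count_files(directory):
    """Count regular files directly inside `directory`, ignoring sub-folders."""
    return sum(1 for entry in Path(directory).iterdir() if entry.is_file())

# Hypothetical location; substitute your own test folder.
# print(count_files("data/dogbreed/test"))
```

Comparing this number with the length of whatever the data object reports for the test set should show whether the extra files are really on disk or coming from somewhere else.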

@KevinB Is there any particular reason why you aren’t using ImageClassifierData.from_csv for the dog breed dataset? Since this dataset is not provided in sub-folders by class, it would be much easier for you to use ImageClassifierData.from_csv instead of ImageClassifierData.from_paths.

Re the bug: I’ve only used from_paths with cats vs dogs (lesson 1) and didn’t see the issue you described there. I haven’t tried it with dog breed, though, for the reason mentioned above.

That was just what I had started with. I was trying to build the model without referring back to the video. That was the one that seemed to make sense when I went through it. I am going to keep digging to see if I can fix the error, but if I don’t get anywhere I will start back at that point and use from_csv. Thanks for the tip though.

Interestingly, the issue actually isn’t with ImageClassifierData itself; it shows up when the data gets fed into a learner. So here is my current status:

The ImageClassifierData gives me 10357, which is what is expected. Now if we do a ConvLearner.pretrained and feed that same data object in…

So it is something with the ConvLearner.pretrained function. Still digging, but I don’t think this is an issue with using from_paths. Very strange, though.

I deleted my tmp directory and am regenerating that information. Hopefully that fixes my issue.

Update: This fixed my issue. Basically it comes down to this: I accidentally moved a few files into my test directory during testing, and when I ran learn=ConvLearner.pretrained(arch, data, precompute=True), the results were saved in the tmp directory. I later deleted my test folder, but unfortunately I hadn’t deleted my tmp directory, so the learner kept using the cached data from the now-deleted test folder, which had 4 extra files. Everything is good now and I am running predictions.
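In case it helps someone hitting the same thing, clearing the cache can be sketched roughly like this. The helper name `clear_tmp_cache` is hypothetical, and it assumes the cache lives in a folder named tmp directly under your data path, as it did in my setup; check where yours is before deleting anything.

```python
import shutil
from pathlib import Path

def clear_tmp_cache(data_dir, tmp_name="tmp"):
    """Delete the cache folder under `data_dir`; return True if something was removed."""
    tmp_dir = Path(data_dir) / tmp_name
    if tmp_dir.is_dir():
        shutil.rmtree(tmp_dir)  # removes the folder and everything inside it
        return True
    return False

# Hypothetical data path; substitute your own.
# clear_tmp_cache("data/dogbreed")
```

After the folder is gone, rerunning the ConvLearner.pretrained call with precompute=True should rebuild the cached results from the files that are actually on disk.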