How to use ImageDataBunch.from_folder?


I have generated and uploaded 2 x 180 .png images (90 x 130 RGB pixels) for my 2-class classifier. Each image file has a name like ‘img_nnn.png’, where nnn is a counter. Under parent_folder I have created two sub-folders, one per class, each containing 180 example images.

I tried ImageDataBunch.from_folder(Path(parent_folder), valid_pct=0.3, size=224, bs=bs). When I ran the cell, the kernel stayed busy for 30 minutes without any error or warning message, so I finally interrupted it.

How do I use .from_folder properly?

Any help is appreciated.

Thanks, Marius

I don’t see anything wrong with your usage of from_folder, unless it is unable to resolve your parent_folder into the right path object, which I can’t confirm since I don’t know how you initialised it!
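One way to rule out a path problem before building the data bunch is to check that the folder resolves and that each class sub-folder holds the expected number of images. This is only a minimal sketch using the standard library; the helper name count_images_per_class and the "*.png" pattern are assumptions made for illustration, not part of fastai.

```python
from pathlib import Path


def count_images_per_class(parent_folder, pattern="*.png"):
    """Return {sub-folder name: number of matching files} for each direct sub-folder.

    Hypothetical helper for illustration only; it is not a fastai function.
    """
    root = Path(parent_folder).expanduser().resolve()
    if not root.is_dir():
        # Fail loudly if the path does not point at an existing directory.
        raise FileNotFoundError(f"{root} is not an existing directory")

    counts = {}
    for sub in sorted(root.iterdir()):
        if sub.is_dir():
            # Count only files matching the pattern directly inside this sub-folder.
            counts[sub.name] = len(list(sub.glob(pattern)))
    return counts
```

With the layout described in the question, calling count_images_per_class(parent_folder) should list exactly two sub-folders with 180 images each; any extra key in the result points to a directory that was not created on purpose.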

I was able to test this out quickly, and the data bunch line executed in a few seconds.

Hope this helps!

Thanks a lot. I will try again with only a few images and report back. Every cell of the notebook ran extremely sluggishly when I tried it at 7 AM in Switzerland (GMT+2). I wonder where my Salamander server is hosted. Perhaps it was in some sort of maintenance mode.

Hi Nalini.

I created the image subdirectories from within Jupyter. This silently created a hidden subdirectory as well, which apparently confuses ImageDataBunch. After removing it from the terminal, I was able to see the data and train on it.
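For anyone hitting the same problem, hidden entries can also be listed from Python rather than from the terminal. The sketch below is a rough illustration, not the exact steps used here: the helper name find_hidden_entries is made up, and the guess that the culprit is a Jupyter-generated folder such as .ipynb_checkpoints is an assumption, since the thread does not name the directory.

```python
from pathlib import Path


def find_hidden_entries(parent_folder):
    """Return all files and directories under parent_folder whose name starts with a dot.

    Hypothetical helper for illustration only.
    """
    root = Path(parent_folder)
    # rglob("*") walks the whole tree; keep only dot-prefixed names.
    return sorted(p for p in root.rglob("*") if p.name.startswith("."))
```

Anything this returns can then be inspected and, if it really is unwanted, deleted, for example with shutil.rmtree for a directory.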

Lesson learned: I need to get familiar with UNIX/POSIX/LINUX commands.

Thanks for your help

Hi Marius,

That doesn’t sound right! You should be able to create directories by calling mkdir on a Path object from within Jupyter without ending up with hidden subdirs. You can also prefix a line with ! to execute terminal commands from within Jupyter.
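As a rough sketch of the mkdir approach mentioned above, the class folders can be created directly from a notebook cell. The helper name make_class_dirs and the example class names are hypothetical and only serve the illustration.

```python
from pathlib import Path


def make_class_dirs(parent_folder, class_names):
    """Create one sub-folder per class under parent_folder and return their paths.

    Hypothetical helper for illustration only.
    """
    root = Path(parent_folder)
    created = []
    for name in class_names:
        class_dir = root / name
        # parents=True also creates parent_folder if needed;
        # exist_ok=True makes the call safe to re-run.
        class_dir.mkdir(parents=True, exist_ok=True)
        created.append(class_dir)
    return created
```

For example, make_class_dirs("data", ["class_a", "class_b"]) creates data/class_a and data/class_b and nothing else. Running !ls -a on the parent folder in a separate cell shows whether any hidden entries are present.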

Perhaps I’m missing some information here, but in any case, glad your issue is resolved!