[Solved Issue] ImageDataBunch.from_folder: No such file or directory

I have the latest release of fastai, viz. 1.0.15!

I'm having a similar issue. I have a folder tree like this:

data
    class1
    class2
    class3

I called data = ImageDataBunch.from_folder(path, valid_pct=0.3, ds_tfms=get_transforms(), size=224)

I get FileNotFoundError: [Errno 2] No such file or directory: 'compositions/train'

My reading of the vision docs suggested that this would be OK. I thought valid_pct would recursively split all the classes into train/valid folders. But I get the same "No such file or directory" error.

At minimum we might consider making it a bit clearer in the docs if several people are making this same error.

It doesn't actually move things into folders, so you need to tell it where the images are. By default it looks for them in 'train', which is not where yours are. So you'll see in my sample notebook I have:

data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.2,
    ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)

I do agree this is unclear/unexpected. Perhaps one option would be for 'train' to default to '.' if 'valid_pct' is not zero?
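
To illustrate the point that a percentage split does not touch the files on disk, here is a minimal sketch in plain Python. It is not fastai's implementation; the helper name split_by_pct and the seed parameter are made up for illustration. It only shuffles a list of paths and slices it in memory.

```python
import random
from pathlib import Path

def split_by_pct(items, valid_pct=0.2, seed=42):
    """Return (train, valid) lists; nothing on disk is moved or copied."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

# Hypothetical file list mirroring the data/class1..class3 layout above.
files = [Path("data") / f"class{c}" / f"img{i}.png"
         for c in (1, 2, 3) for i in range(10)]

train_files, valid_files = split_by_pct(files, valid_pct=0.2)
print(len(train_files), len(valid_files))
```

Because the split lives only in memory, the loader still has to be told where the class folders are, which is what train="." does in the call above.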

Or maybe just catch that error and spell it out a bit? Instead of the bare FileNotFoundError, it could tell you that by default the images should be in path/train, and that if you want a different location you should set the train parameter to train={desired location of files}. Not sure if that would help people or confuse them.
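
A rough sketch of what such a friendlier check could look like, written as a standalone helper rather than as a patch to fastai. The function name check_train_dir and the wording of the message are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

def check_train_dir(path, train="train"):
    """Raise a more descriptive FileNotFoundError if path/train is missing."""
    train_dir = Path(path) / train
    if not train_dir.is_dir():
        raise FileNotFoundError(
            f"No such directory: '{train_dir}'. Images are expected in "
            f"path/{train} by default. If your class folders sit directly "
            f"under '{path}', pass train='.' instead."
        )
    return train_dir

# Demo on a throwaway directory that has a class folder but no 'train' folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "class1").mkdir()
    try:
        check_train_dir(tmp)                   # default train="train" is missing
    except FileNotFoundError as err:
        message = str(err)
    resolved = check_train_dir(tmp, train=".")  # passes: tmp itself exists
```

Calling a check like this before building the data bunch would surface the hint about train='.' right where the confusion happens.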

This would have helped me!

I'm probably just going to script a split from:

data
    class1
    class2

to

data
    train
        class1
        class2
    valid
        class1
        class2
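
For anyone who does want the physical train/valid layout shown above, the split can be scripted along these lines. This is a sketch using only the standard library; the function name split_into_train_valid and its parameters are hypothetical, and it copies files into a separate destination rather than moving them.

```python
import random
import shutil
import tempfile
from pathlib import Path

def split_into_train_valid(src, dst, valid_pct=0.2, seed=0):
    """Copy src/<class>/* into dst/train/<class> and dst/valid/<class>."""
    src, dst = Path(src), Path(dst)
    rng = random.Random(seed)
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_valid = int(len(files) * valid_pct)   # first n_valid go to 'valid'
        for i, f in enumerate(files):
            subset = "valid" if i < n_valid else "train"
            target = dst / subset / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target / f.name)

# Demo on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp) / "data", Path(tmp) / "split"
    for cls in ("class1", "class2"):
        (src / cls).mkdir(parents=True)
        for i in range(10):
            (src / cls / f"img{i}.png").touch()
    split_into_train_valid(src, dst, valid_pct=0.2)
    n_train = len(list((dst / "train").rglob("*.png")))
    n_valid = len(list((dst / "valid").rglob("*.png")))
```

Copying keeps the original tree intact, at the cost of extra disk space; swapping shutil.copy2 for shutil.move would avoid the duplication.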

You should still be able to use it with your data laid out as you showed above. Is your path variable pointing to data?

It is pointing to the parent directory of my big list of dirs (classes).

Yeah, try adding train="." to the call. That tells it to use the current folder.

It still seems to be looking for the valid/train split in the current dir.

No such file or directory: 'compositions/valid/albéniz'

path points to compositions, and albéniz is a child of compositions.

So do you have compositions/data/{class1, class2, etc}?

Nope, it looks just like this:

compositions
    albeniz
        img1.png
        img2.png
    bach
        img1.png
        img2.png
    etc..

I already tried cd-ing into compositions and running it from there, as well as cd-ing in and setting path to '.'.

It consistently looks for valid/folder_name_here, which obviously doesn't exist. This seems to square with what Jeremy shared above.

I would love for this to work, but perhaps it's just not designed that way?

Can you post your current code? I am curious to try recreating the issue.

I'm mostly interested in the ImageDataBunch.from_folder call.

Yeah!

path = Path('compositions')
data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.3, ds_tfms=get_transforms(), size=224)

That's it!

BTW, using from_csv is often easier than putting things in folders. I introduce the folders approach because it's simpler for people not so familiar with coding, but personally I use CSV files most of the time.
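
If your images are already sorted into class folders, a labels file for the CSV route can be generated from that tree. The sketch below uses only the standard library. The helper name folders_to_csv, the file name labels.csv, and the two-column name/label layout are assumptions, so check the from_csv documentation for your fastai version before relying on them.

```python
import csv
import tempfile
from pathlib import Path

def folders_to_csv(root, csv_name="labels.csv",
                   extensions=(".png", ".jpg", ".jpeg")):
    """Write one (relative image path, class folder name) row per image."""
    root = Path(root)
    rows = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for img in sorted(class_dir.iterdir()):
            if img.suffix.lower() in extensions:
                rows.append((img.relative_to(root).as_posix(), class_dir.name))
    with open(root / csv_name, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "label"])   # assumed header names
        writer.writerows(rows)
    return rows

# Demo on placeholder files laid out like the compositions folder.
with tempfile.TemporaryDirectory() as tmp:
    for cls in ("albeniz", "bach"):
        (Path(tmp) / cls).mkdir()
        for name in ("img1.png", "img2.png"):
            (Path(tmp) / cls / name).touch()
    rows = folders_to_csv(tmp)
    csv_written = (Path(tmp) / "labels.csv").is_file()
```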

Makes sense. I only used folders because I falsely assumed that method would generate the train/valid split for me. Lesson learned!

Hmmm, I'm not able to recreate the issue. I tried getting as close to your setup as I could. Here is what I have:

I created a folder named compositions, like yours, at the same level as the notebook I am running. Inside it are 4 directories with images in them. With that layout, this works for me:

%reload_ext autoreload
%autoreload 2
%matplotlib inline

from fastai import *
from fastai.vision import *

path = Path("compositions")

data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.3, ds_tfms=get_transforms(), size=224)

One other thing: can you run show_install(0)?

Really hoping we can get this working for you, because it is a great way to load images into the ImageDataBunch.

One other thing to check: how many images do you have in albéniz?
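
A quick way to answer that for every class folder at once is to count image files per directory. This is a sketch with a hypothetical helper name, count_images, and an assumed set of image extensions.

```python
import tempfile
from pathlib import Path

def count_images(root, extensions=(".png", ".jpg", ".jpeg")):
    """Map each class folder under root to the number of image files in it."""
    root = Path(root)
    return {
        d.name: sum(1 for f in d.iterdir() if f.suffix.lower() in extensions)
        for d in sorted(root.iterdir()) if d.is_dir()
    }

# Demo on placeholder files laid out like the compositions folder.
with tempfile.TemporaryDirectory() as tmp:
    for cls in ("albeniz", "bach"):
        (Path(tmp) / cls).mkdir()
        for name in ("img1.png", "img2.png"):
            (Path(tmp) / cls / name).touch()
    counts = count_images(tmp)
    print(counts)
```

On the real data this would be called as count_images(Path("compositions")).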

Strange!

There are two images in the first folder.

Here's the install info:

=== Software === 
python version  : 3.6.5
fastai version  : 1.0.12
torch version   : 1.0.0.dev20181022
nvidia driver   : 396.44
torch cuda ver  : 9.2.148
torch cuda is   : available
torch cudnn ver : 7104
torch cudnn is  : enabled

=== Hardware === 
nvidia gpus     : 1
torch available : 1
  - gpu0        : 16280MB | Tesla P100-PCIE-16GB

=== Environment === 
platform        : Linux-4.9.0-8-amd64-x86_64-with-debian-9.5
distro          : #1 SMP Debian 4.9.110-3+deb9u6 (2018-10-08)
conda env       : Unknown
python          : /opt/anaconda3/bin/python
sys.path        : 
/opt/anaconda3/lib/python36.zip
/opt/anaconda3/lib/python3.6
/opt/anaconda3/lib/python3.6/lib-dynload
/opt/anaconda3/lib/python3.6/site-packages
/opt/anaconda3/lib/python3.6/site-packages/IPython/extensions
/home/jupyter/.ipython

You are on a slightly older version of fastai than I am. Maybe there was a bug that has since been fixed? I am on fastai version 1.0.15.
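
A small sketch for comparing the two version strings quoted in this thread. The helper version_tuple is hypothetical and only handles plain dotted numbers. In a live session the installed version could be read from fastai.__version__ instead of being hard-coded.

```python
def version_tuple(version):
    """Turn '1.0.12' into (1, 0, 12); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)

installed = "1.0.12"   # value reported by show_install in this thread
working = "1.0.15"     # version on which the same call worked
needs_upgrade = version_tuple(installed) < version_tuple(working)
print(needs_upgrade)
```

Comparing tuples of integers avoids the trap of comparing the raw strings, where "1.0.9" would sort after "1.0.15".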

This fixed the problem. Thanks so much @KevinB !