[Solved Issue] ImageDataBunch.from_folder: No such file or directory

Shubhajit · October 29, 2018, 7:16pm

I have the latest release of fastai, viz. 1.0.15!

zachcaceres · October 30, 2018, 3:29am

I’m having a similar issue. I have a folder tree like this:

data
    class1
    class2
    class3

I called data = ImageDataBunch.from_folder(path, valid_pct=0.3, ds_tfms=get_transforms(), size=224)

I get FileNotFoundError: [Errno 2] No such file or directory: 'compositions/train'

My reading of the vision docs suggested that this would be OK. I thought valid_pct would recursively split all the classes into train/valid folders. But I get the same error: No such file or directory.

At minimum we might consider making this a bit clearer in the docs if several people are hitting the same error.

jeremy · October 30, 2018, 3:35am

It doesn’t actually move things into folders. So you need to tell it where the images are. By default they are in ‘train’, which yours aren’t. So you’ll see in my sample notebook I have:

data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.2,
    ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)

I do agree this is unclear/unexpected. Perhaps one option would be for ‘train’ to default to ‘.’ if ‘valid_pct’ is not zero?
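To illustrate the point that valid_pct does not move anything on disk, here is a rough sketch of a percentage split done purely in memory. It is not fastai's implementation; split_in_memory and its seed argument are hypothetical names used only for this example.

```python
import random


def split_in_memory(files, valid_pct=0.2, seed=42):
    """Return (train, valid) lists of files; nothing is moved on disk.

    Hypothetical helper for illustration, not part of fastai.
    """
    files = sorted(files)          # stable starting order
    rng = random.Random(seed)      # local RNG so the split is repeatable
    rng.shuffle(files)
    n_valid = int(len(files) * valid_pct)
    return files[n_valid:], files[:n_valid]
```

The folder layout stays exactly as it was; only the two lists differ, which is why the library still needs to be told where the images live.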

KevinB · October 30, 2018, 3:43am

Or maybe just catch that error and spell it out a bit? Instead of the bare FileNotFoundError, it could tell you that by default the images should be in path/train, and that if you want to change that location you should set the train parameter: train={desired location of files}. Not sure whether that would help people or confuse them.
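A minimal sketch of what such a friendlier error could look like, written as a standalone check you could run before building the data bunch. check_train_folder is a hypothetical helper, not a fastai function, and the wording of the message is only a suggestion.

```python
from pathlib import Path


def check_train_folder(path, train="train"):
    """Raise a descriptive error if path/train is missing (hypothetical helper)."""
    train_dir = Path(path) / train
    if not train_dir.is_dir():
        raise FileNotFoundError(
            f"No such directory: '{train_dir}'. By default the images are "
            f"expected in path/train. If your class folders sit directly "
            f"under '{path}', pass train='.'."
        )
    return train_dir
```

Calling check_train_folder(path) with the default would fail for the layout described above, while check_train_folder(path, train=".") would pass.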

zachcaceres · October 30, 2018, 3:44am

This would have helped me!

I’m probably just going to script a split from:

data
    class1
    class2

to

data
    train
        class1
        class2
    valid
        class1
        class2
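For anyone who does want the files physically split, a sketch along these lines would do it. It copies rather than moves, and assumes a separate destination folder; split_into_train_valid, dst and seed are hypothetical names, not anything from fastai.

```python
import random
import shutil
from pathlib import Path


def split_into_train_valid(src, dst, valid_pct=0.3, seed=42):
    """Copy src/<class>/* into dst/train/<class> and dst/valid/<class>."""
    src, dst = Path(src), Path(dst)
    rng = random.Random(seed)
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_valid = int(len(files) * valid_pct)  # rounds down
        for i, f in enumerate(files):
            split = "valid" if i < n_valid else "train"
            target = dst / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target / f.name)
```

Because the count is rounded down, a class with only a couple of images ends up with nothing in valid in this sketch, so very small classes need separate handling.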

KevinB · October 30, 2018, 3:45am

You should still be able to use it if your data is laid out as you showed above. Is your path variable pointing to data?

zachcaceres · October 30, 2018, 3:46am

It is pointing to the parent directory of my big list of dirs (classes)

KevinB · October 30, 2018, 3:47am

Yeah, try adding train="." to the call. That tells it to look for the images directly in path instead of in path/train.

zachcaceres · October 30, 2018, 3:49am

It still seems to be looking for the valid/train split in the current dir.

No such file or directory: 'compositions/valid/albéniz'

Path points to compositions and albéniz is a child of compositions

KevinB · October 30, 2018, 3:50am

so do you have compositions/data/{class1, class2, etc}?

zachcaceres · October 30, 2018, 3:51am

Nope it looks just like this:

compositions
    albeniz
        img1.png
        img2.png
    bach
        img1.png
        img2.png
    etc..

zachcaceres · October 30, 2018, 3:53am

I already tried cd-ing into compositions and running it from there, as well as cd-ing in and setting path to ‘.’

It consistently looks for valid/folder_name_here, which obviously doesn’t exist. This seems to square with what Jeremy shared above.

I would love for this to work, but perhaps it’s just not designed that way?

KevinB · October 30, 2018, 3:54am

Can you post your current code? I am curious to try recreating the issue

Mostly interested in the ImageDataBunch.from_folder command

zachcaceres · October 30, 2018, 3:56am

Yeah!

path = Path('compositions')
data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.3, ds_tfms=get_transforms(), size=224)

That’s it!

jeremy · October 30, 2018, 3:58am

BTW using from_csv is often easier than putting things in folders. I introduce the folders approach because it’s simpler for people not so familiar with coding - but personally I use CSV files most of the time.
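If the images are already sorted into class folders, a labels file can be generated from that layout. The sketch below is one way to do it with the standard library only; write_labels_csv, the default file name and the name/label column headers are assumptions for this example, and the exact arguments from_csv expects depend on the fastai version, so check the docs for your install.

```python
import csv
from pathlib import Path


def write_labels_csv(path, csv_name="labels.csv"):
    """Write one (relative file path, label) row per file under path/<class>/."""
    path = Path(path)
    rows = []
    for class_dir in sorted(p for p in path.iterdir() if p.is_dir()):
        for f in sorted(class_dir.iterdir()):
            if f.is_file():
                # the label is simply the name of the containing folder
                rows.append((f.relative_to(path).as_posix(), class_dir.name))
    with open(path / csv_name, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["name", "label"])  # assumed column names
        writer.writerows(rows)
    return rows
```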

zachcaceres · October 30, 2018, 3:59am

Makes sense. I only did it because of my false assumption that I could generate the valid/train split with that method. Lesson learned!

KevinB · October 30, 2018, 4:14am

Hmmm, I’m not able to recreate the issue. I tried getting as close to your instance as I could. Here is what I have:

So I copied your compositions folder name, and the folder is at the same level as the notebook I am running. Inside of it are 4 directories with images in them. With this setup, the following works for me:

%reload_ext autoreload
%autoreload 2
%matplotlib inline

from fastai import *
from fastai.vision import *

path = Path("compositions")

data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.3, ds_tfms=get_transforms(), size=224)

One other thing: can you run show_install(0)?

Really hoping we can get this working for you, because it is a great way to load images into the ImageDataBunch.

One other thing to also check is how many images do you have in albéniz?
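A quick way to check the counts for every class at once is sketched below. count_images_per_class is a hypothetical helper and the set of suffixes is an assumption; extend it to match your files.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumption: extend as needed


def count_images_per_class(path):
    """Map each class folder name under path to its number of image files."""
    path = Path(path)
    return {
        d.name: sum(1 for f in d.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES)
        for d in sorted(path.iterdir())
        if d.is_dir()
    }
```

Something like count_images_per_class(Path("compositions")) would then show at a glance which classes have very few images.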

zachcaceres · October 30, 2018, 4:27am

Strange!

There are two images in the first folder.

Here’s the install info:

=== Software === 
python version  : 3.6.5
fastai version  : 1.0.12
torch version   : 1.0.0.dev20181022
nvidia driver   : 396.44
torch cuda ver  : 9.2.148
torch cuda is   : available
torch cudnn ver : 7104
torch cudnn is  : enabled

=== Hardware === 
nvidia gpus     : 1
torch available : 1
  - gpu0        : 16280MB | Tesla P100-PCIE-16GB

=== Environment === 
platform        : Linux-4.9.0-8-amd64-x86_64-with-debian-9.5
distro          : #1 SMP Debian 4.9.110-3+deb9u6 (2018-10-08)
conda env       : Unknown
python          : /opt/anaconda3/bin/python
sys.path        : 
/opt/anaconda3/lib/python36.zip
/opt/anaconda3/lib/python3.6
/opt/anaconda3/lib/python3.6/lib-dynload
/opt/anaconda3/lib/python3.6/site-packages
/opt/anaconda3/lib/python3.6/site-packages/IPython/extensions
/home/jupyter/.ipython

KevinB · October 30, 2018, 4:35am

You are on a somewhat older version of fastai than I am. Maybe there was a bug that has since been fixed? I am on fastai version 1.0.15.

zachcaceres · October 31, 2018, 2:06am

This fixed the problem. Thanks so much @KevinB !