I am trying to train a model for Devanagari character classification. Line 9 of my script gives this error:
Traceback (most recent call last):
File "Model.py", line 9, in <module>
data = ImageDataBunch.from_folder(path, train = 'train', valid = 'valid', ds_tfms=tfms, size=32)
File "/usr/lib/python3.8/site-packages/fastai/vision/data.py", line 108, in from_folder
if valid_pct is None: src = il.split_by_folder(train=train, valid=valid)
File "/usr/lib/python3.8/site-packages/fastai/data_block.py", line 212, in split_by_folder
return self.split_by_idxs(self._get_by_folder(train), self._get_by_folder(valid))
File "/usr/lib/python3.8/site-packages/fastai/data_block.py", line 207, in _get_by_folder
return [i for i in range_of(self) if (self.items[i].parts[self.num_parts] if isinstance(self.items[i], Path)
File "/usr/lib/python3.8/site-packages/fastai/data_block.py", line 207, in <listcomp>
return [i for i in range_of(self) if (self.items[i].parts[self.num_parts] if isinstance(self.items[i], Path)
IndexError: index 0 is out of bounds for axis 0 with size 0
My code is like this:
from fastai import *
from pathlib import Path
from fastai.vision import *
from fastai.metrics import error_rate
bs = 16
path = Path('/home/apostrophie/Documents/DevanagariChars/DevanagariHandwrittenCharacterDataset')
tfms = get_transforms(do_flip=False)
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=32)
The dataset structure is like this:
This is my first time training from a folder, and I cannot find a solution even after trying for several hours.
Any help would be really appreciated.
Edit:
After some researching, I also tried using this, but to no avail:
data = (ImageList.from_folder(path)
.split_by_folder()
.label_from_folder()
.transform(tfms, size=32)
.databunch())
data = ImageDataBunch.from_df(path, df, ds_tfms=tfms, size=32)
Try using this if you have a validation folder named "valid".
How does ImageDataBunch.from_folder aggregate the data? Does it join all the data (train, dev and test) together and then do a split using the default 0.2 for train/dev?
ImageDataBunch.from_folder expects a standard ImageNet-style layout. For example, in your path folder:
path = Path("your_computer/Number_folder")
Inside Number_folder:
Number_folder
------------------Number_1
--------------------------1(1).jpg 1(2).jpg
------------------Number_2
-------------------------2(1).jpg 2(2).jpg
------------------Number_3
-------------------------3(1).jpg 3(2).jpg
For this, you don't need the valid folder if you use "split by random".
This was taught in the lesson 2 notebook.
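To make the layout idea concrete, here is a minimal plain-Python sketch of how a label can be read off the parent folder name in an ImageNet-style tree. It does not use fastai, and fastai's own implementation may differ; the helper name labels_from_folders and the set of image extensions are assumptions made for illustration.

```python
from pathlib import Path
import tempfile

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def labels_from_folders(root):
    """Map each image under root (as a relative path) to its parent folder name."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): p.parent.name
        for p in sorted(root.rglob("*"))
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }

# Build a tiny stand-in for the Number_folder layout and label it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "Number_folder"
    layout = {"Number_1": ["1(1).jpg", "1(2).jpg"], "Number_2": ["2(1).jpg"]}
    for class_name, file_names in layout.items():
        (root / class_name).mkdir(parents=True)
        for file_name in file_names:
            (root / class_name / file_name).touch()  # empty placeholder files
    demo_labels = labels_from_folders(root)

print(demo_labels)
```

Each image ends up labeled with the folder it sits in, which is why the class folders need to be directly recognizable from the path.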
@JonathanSum I have a validation folder with the name "valid", yet I cannot get it to work.
The df must include all the image paths with their labels, right?
I am also wondering what is causing that IndexError in my code.


Do you have these structures in your folder?
If you still get an error, I think your path variable is not pointing to the folder that contains the train folder.
Try the fastai part 1 lesson notebooks; one of them uses train and valid folders for training.
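One way to check this without fastai is to count the image files under each split folder. The traceback fails while indexing self.items, so one plausible cause (an assumption, not confirmed in this thread) is that no images were found under path at all. The sketch below uses only pathlib; the helper name count_images and the extension set are hypothetical.

```python
from pathlib import Path
import tempfile

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def count_images(path, splits=("train", "valid")):
    """Return {split: {class_name: n_images}}; a missing split folder maps to None."""
    path = Path(path)
    report = {}
    for split in splits:
        split_dir = path / split
        if not split_dir.is_dir():
            report[split] = None  # the folder is not where we expected it
            continue
        report[split] = {
            class_dir.name: sum(
                1
                for f in class_dir.rglob("*")
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return report

# Demo on a throwaway folder that has a train folder but no valid folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "train" / "class_a").mkdir(parents=True)
    (root / "train" / "class_a" / "img1.png").touch()
    demo_report = count_images(root)

print(demo_report)
```

Calling count_images(path) with the path from your own script should show at a glance whether train and valid sit directly under it and whether each class folder actually contains images.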
I would suggest first trying out this command
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=32, valid_pct = x)
Set x to whatever validation split you want. Typically it's 0.25.
Check whether this works; then we can try using the valid folder. The command above picks up the train folder and creates a random validation split from that training set itself.
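As a rough illustration of what a random percentage split means, here is a plain-Python sketch. It is not fastai's code; the function name random_split and the seed are assumptions chosen for the example.

```python
import random

def random_split(n_items, valid_pct=0.25, seed=42):
    """Shuffle the indices 0..n_items-1 and hold out the first valid_pct of them."""
    idxs = list(range(n_items))
    random.Random(seed).shuffle(idxs)  # seeded so the split is reproducible
    cut = int(valid_pct * n_items)     # number of validation items
    valid_idx = sorted(idxs[:cut])
    train_idx = sorted(idxs[cut:])
    return train_idx, valid_idx

train_idx, valid_idx = random_split(100, valid_pct=0.25)
print(len(train_idx), len(valid_idx))  # prints: 75 25
```

The point is that the validation items are drawn at random from one pool of images, so no separate valid folder is required for this mode.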
Also the error message makes it clearer:
File "/usr/lib/python3.8/site-packages/fastai/vision/data.py", line 108, in from_folder
***if valid_pct is None: src = il.split_by_folder(train=train, valid=valid)***
You have not specified a valid_pct, and there is no valid folder in your data structure. That is what causes the error.
It still gives the error with valid_pct = 0.25, so I created a "valid" folder. My dataset looks like the above.
Okay and you are still getting an error with the above folder structure?