After cleaning, ImageDataBunch.from_csv causes errors

I am using the code given in the notebook:

If you already cleaned your data, run this cell instead of the one before

np.random.seed(42)
data = ImageDataBunch.from_csv(".", folder=".", valid_pct=0.2, csv_labels='cleaned.csv',
                               ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)

I get this result:
You can deactivate this warning by passing no_check=True.
/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/basic_data.py:226: UserWarning: There seems to be something wrong with your dataset, can’t access any element of self.train_ds.
Tried: 561,62,711,69,44…
warn(warn_msg)

I can change the ImageDataBunch.from_csv call to:
data = ImageDataBunch.from_csv(path, label_delim=',', valid_pct=0.2, csv_labels='cleaned.csv',
                               ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)

Then I don't get any errors when running the cell, but when I go on to train, it always gets interrupted. I got good results before, but after cleaning I'm stuck…

I’m having a similar problem. After cleaning the data, I go back to load the new data from the CSV with these lines:

np.random.seed(42)
data = ImageDataBunch.from_csv(".", folder=".", valid_pct=0.2, csv_labels='cleaned.csv',
                               ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)

But it can’t find the CSV file. It is in the same folder as my classes, data/fruits.


FileNotFoundError Traceback (most recent call last)
in
3 print(path)
4 data = ImageDataBunch.from_csv(".", folder=".", valid_pct=0.2, csv_labels='cleaned.csv',
----> 5 ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)

/opt/conda/envs/fastai/lib/python3.6/site-packages/fastai/vision/data.py in from_csv(cls, path, folder, label_delim, csv_labels, valid_pct, fn_col, label_col, suffix, header, **kwargs)
136 "Create from a csv file in path/csv_labels."
137 path = Path(path)
--> 138 df = pd.read_csv(path/csv_labels, header=header)
139 return cls.from_df(path, df, folder=folder, label_delim=label_delim, valid_pct=valid_pct,
140 fn_col=fn_col, label_col=label_col, suffix=suffix, **kwargs)

/opt/conda/envs/fastai/lib/python3.6/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, doublequote, delim_whitespace, low_memory, memory_map, float_precision)
676 skip_blank_lines=skip_blank_lines)
677
--> 678 return _read(filepath_or_buffer, kwds)
679
680 parser_f.__name__ = name

/opt/conda/envs/fastai/lib/python3.6/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
438
439 # Create the parser.
--> 440 parser = TextFileReader(filepath_or_buffer, **kwds)
441
442 if chunksize or iterator:

/opt/conda/envs/fastai/lib/python3.6/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
785 self.options['has_index_names'] = kwds['has_index_names']
786
--> 787 self._make_engine(self.engine)
788
789 def close(self):

/opt/conda/envs/fastai/lib/python3.6/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
1012 def _make_engine(self, engine='c'):
1013 if engine == 'c':
-> 1014 self._engine = CParserWrapper(self.f, **self.options)
1015 else:
1016 if engine == 'python':

/opt/conda/envs/fastai/lib/python3.6/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
1706 kwds['usecols'] = self.usecols
1707
-> 1708 self._reader = parsers.TextReader(src, **kwds)
1709
1710 passed_names = self.names is None

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()

FileNotFoundError: File b'cleaned.csv' does not exist

Hey,
Try doing it again.
I did the same thing again as yesterday, and now it's working.

Just look at the documentation. pd.read_csv() fails because from_csv joins its first argument ('.') with csv_labels, so it looks for cleaned.csv in the current directory instead of in your data folder. Just change the first argument from '.' to path and you should be good!
So, use this:
data = ImageDataBunch.from_csv(path, folder=".", valid_pct=0.2, csv_labels=‘cleaned.csv’,
ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)
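
To see why the first argument matters, here is a minimal sketch using only pathlib. It mirrors the path/csv_labels join visible in the traceback above. The folder name data/fruits is taken from the earlier post, and resolve_csv is a hypothetical helper for illustration, not part of fastai.

```python
from pathlib import Path


def resolve_csv(path, csv_labels="cleaned.csv"):
    """Hypothetical helper: mimic the `path/csv_labels` join seen in the traceback."""
    return Path(path) / csv_labels


# With "." the CSV is looked up relative to the current working directory.
print(resolve_csv(".").as_posix())
# With the data folder the CSV is looked up next to the class folders.
print(resolve_csv("data/fruits").as_posix())
```

If the first line printed is not where cleaned.csv actually lives, passing path (the data folder) as the first argument is the likely fix.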

Thanks @danieldsouza!

If you get "SyntaxError: invalid character in identifier", replace the curly quotes with straight quotes, like this:

data = ImageDataBunch.from_csv(path, folder=".", valid_pct=0.2, csv_labels='cleaned.csv',
ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)
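
If you paste code from the forum often, the replacement can also be done programmatically. This is just a small sketch; straighten_quotes is a hypothetical helper name, and it only handles the four common curly quote characters.

```python
# Map the curly quotes that forums and word processors produce to plain ASCII quotes.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
})


def straighten_quotes(code):
    """Hypothetical helper: return `code` with curly quotes replaced by straight ones."""
    return code.translate(CURLY_TO_STRAIGHT)


pasted = "csv_labels=\u2018cleaned.csv\u2019"
print(straighten_quotes(pasted))
```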

Hi,
I still have a problem with this method.
My folder structure for the images is as follows (it worked fine in the lesson2 notebook):

  • data
    • gardens
      • barock
        • jpgs
      • english
        • jpgs
      • persian
        • jpgs
      • japan
        • jpgs
      • cleaned.csv

I tried

data = ImageDataBunch.from_csv(path, folder=".", valid_pct=0.2, csv_labels='cleaned.csv',
                               ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)

The path to the images is not built correctly. The message at the end of the stack trace is:

FileNotFoundError: [Errno 2] No such file or directory: 'data/gardens/./japan/00000146.jpg'

=> The correct path to the picture would be ‘data/gardens/./japan/00000146.jpg’

When trying with an empty string ("") for the folder, I get

FileNotFoundError: [Errno 2] No such file or directory: 'data/gardens//japan/00000146.jpg'

So it seems that from_csv (as well as from_df, which it calls internally) is not putting the path together correctly.
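
One way to narrow this down is to check the joined paths outside fastai. The sketch below is only a diagnostic under stated assumptions: it assumes cleaned.csv has a header row and that its first column holds image paths relative to the data folder, and missing_files is a hypothetical helper, not a fastai function. It also prints the normalized form of the two paths from the error messages; os.path.normpath collapses "./" and "//", so on a POSIX system both spellings should point at the same file, and the extra separator may not be the cause by itself.

```python
import csv
import os


def missing_files(data_dir, csv_name="cleaned.csv", folder="."):
    """Hypothetical diagnostic: list CSV entries whose image file does not exist.

    Assumes a header row and image paths (relative to data_dir/folder) in column 0.
    """
    missing = []
    with open(os.path.join(data_dir, csv_name), newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the assumed header row
        for row in reader:
            if not row:
                continue  # ignore blank lines
            full = os.path.normpath(os.path.join(data_dir, folder, row[0]))
            if not os.path.isfile(full):
                missing.append(full)
    return missing


# Both spellings from the error messages normalize to the same path.
print(os.path.normpath("data/gardens/./japan/00000146.jpg"))
print(os.path.normpath("data/gardens//japan/00000146.jpg"))
```

Calling missing_files("data/gardens") from the notebook's working directory should return an empty list if every image listed in the CSV can be found. Any paths it does return are the ones worth inspecting by hand.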