BUG - Data classes with Spaces

Hi

I am trying to build a model for the plant seedlings classification competition.
I am using the dog breeds template and created a CSV from the folder names.
However, some species have a space in the name, and it seems that ImageClassifierData.from_csv is not working properly (or, most likely, I am doing something wrong).


Thanks!
Ale

The easiest way would be to replace the spaces in the class names with underscores. This is easily done with pandas, I believe (IIRC with the apply function); then save the CSV files again.
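For what it's worth, here is a minimal sketch of that renaming step. It uses the standard csv module instead of pandas so it runs without extra dependencies; with pandas, something like df['species'] = df['species'].str.replace(' ', '_') should do the same. The column names (file, species) and the sample rows are assumptions for illustration, since the actual CSV layout isn't shown in the thread.

```python
import csv
import io


def underscore_labels(src, dst, label_col="species"):
    """Copy CSV rows from src to dst, replacing spaces in label_col with underscores."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # "Common Chickweed" becomes "Common_Chickweed", so it stays a single class.
        row[label_col] = row[label_col].strip().replace(" ", "_")
        writer.writerow(row)


# Hypothetical labels file built from the folder names (in-memory for the example).
src = io.StringIO("file,species\n0a1b.png,Common Chickweed\n9f3c.png,Maize\n")
dst = io.StringIO()
underscore_labels(src, dst)
result = dst.getvalue()
print(result)
```

With real files, pass open(path, newline="") handles instead of the StringIO objects, and point from_csv at the rewritten file.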

Yep ok so it is a feature or a bug of the function ImageClassifierData.from_csv

Spaces denote multiple classes for a file in multi-label classification, so that’s why “Common Chickweed” becomes the separate classes Common and Chickweed. I haven’t tried it, but perhaps using quotes might work.

ImageClassifierData.from_csv() invokes the parse_csv_labels(fn, skip_header=True, cat_separator=' ') function internally.

  • fn: Path to a CSV file.
  • cat_separator: the separator for the categories column

parse_csv_labels is a function in the fastai lib. It parses filenames and label sets from a CSV file. The labels in each label set are expected to be space-separated. The function creates a pandas DataFrame from the CSV file and then splits the category column on cat_separator to produce the class labels.
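To make the effect concrete, here is a rough illustration of that splitting step. This is not the fastai source, just a stand-in that mimics the behaviour described above; split_labels and the sample rows are made up for the example.

```python
def split_labels(rows, cat_separator=" "):
    """Map each filename to its list of labels by splitting the category string."""
    return {fname: cats.split(cat_separator) for fname, cats in rows}


# Hypothetical (filename, category) pairs as they might appear in the CSV.
rows = [("0a1b.png", "Common Chickweed"), ("9f3c.png", "Maize")]

parsed = split_labels(rows)
print(parsed["0a1b.png"])  # the space splits one species into two labels

# After replacing the space with an underscore, the label stays whole.
fixed = split_labels([("0a1b.png", "Common_Chickweed")])
print(fixed["0a1b.png"])
```

So with the default space separator, any class name containing a space is read as several labels, which is why the underscore workaround helps.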

Unfortunately, in the current ImageClassifierData API there is no way to pass in a cat_separator other than the default one (a space).

Ok I understand now thanks all!