Correct way to acquire a dataset for Single-Labeled data


I have a dataset whose labels are stored in a pandas DataFrame, as follows:


The dataset is single-labeled. I’m acquiring the dataset in the following way:

See what happens:


It seems the dataset was automatically acquired as multi-labeled.

How can I acquire it as single-labeled? If possible, I prefer not to change the DataFrame structure.


You can’t if the labels are one-hot encoded; the data block API doesn’t support that.
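One possible workaround, sketched below, is to leave the original DataFrame untouched and derive a single label column on a copy. This is only a sketch: it assumes the labels are stored as one 0/1 column per class (which the reply implies, but the DataFrame itself is not shown here), and the column names, the sample data, and the helper name `add_single_label` are all hypothetical.

```python
import pandas as pd


def add_single_label(df, class_cols, label_col="label"):
    """Return a copy of df with one label column derived from one-hot columns.

    class_cols and label_col are hypothetical names; adapt them to your data.
    """
    onehot = df[class_cols]
    # A single-labeled row has only 0/1 values and exactly one active class.
    if not onehot.isin([0, 1]).all().all():
        raise ValueError("class columns must contain only 0 and 1")
    if not (onehot.sum(axis=1) == 1).all():
        raise ValueError("every row needs exactly one active class column")
    out = df.copy()  # the caller's DataFrame is not modified
    # idxmax(axis=1) gives, for each row, the name of the column holding the 1.
    out[label_col] = onehot.idxmax(axis=1)
    return out


# Hypothetical example data: one file-name column and three class columns.
df = pd.DataFrame(
    {
        "name": ["a.png", "b.png", "c.png"],
        "cat": [1, 0, 0],
        "dog": [0, 1, 0],
        "bird": [0, 0, 1],
    }
)
labeled = add_single_label(df, ["cat", "dog", "bird"])
print(labeled[["name", "label"]])
```

The resulting copy has one categorical column, which is the shape a label-from-DataFrame step normally expects; whether that fits your pipeline depends on the library version you are using.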

I see.

Is there any chance that this will be supported in the near future? If not:

  • To work around the problem, I wrote a bit of code that reads each file name and its label and copies the image into a per-label subfolder. Then I acquire them with from_folder. It is quick and practical, but not at all elegant (and it wastes disk space). Can you think of any other method to acquire them with the data block API?

  • If I manage to understand the data block API internals, I’ll try to write something myself. In that case, I’ll open a PR.
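For reference, the folder workaround described in the first bullet might look roughly like the sketch below. It is not the code actually used; the function name `build_label_folders` and the `use_symlinks` option are hypothetical. The symlink option is an addition meant to avoid duplicating the images on disk, and it assumes that the folder-based loader follows symbolic links, which has not been verified here.

```python
import os
import shutil
from pathlib import Path


def build_label_folders(items, src_dir, dst_dir, use_symlinks=False):
    """Place each image under dst_dir/<label>/<file name>.

    items is an iterable of (file_name, label) pairs, for example built
    from the file-name and label columns of the DataFrame.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    for file_name, label in items:
        src = src_dir / file_name
        target_dir = dst_dir / str(label)
        target_dir.mkdir(parents=True, exist_ok=True)
        dst = target_dir / src.name
        if dst.exists() or dst.is_symlink():
            continue  # already placed on a previous run
        if use_symlinks:
            # Link to the original file instead of duplicating it on disk.
            os.symlink(src.resolve(), dst)
        else:
            shutil.copy2(src, dst)
    return dst_dir
```

With `use_symlinks=True` only the directory entries are created, so the disk-space cost mentioned above mostly disappears, at the price of depending on symlink support in both the file system and the loader.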