I have a dataset whose labels are stored in a pandas DataFrame, as follows:
The dataset is single-labeled. I’m acquiring the dataset in the following way:
See what happens:
It seems the dataset was automatically acquired as multi-labeled.
How can I acquire it as single-labeled? If possible, I would prefer not to change the DataFrame structure.
You can’t if the labels are one-hot encoded; the data block API doesn’t support it.
Is there any chance that this will be supported in the near future? If not:
To work around the problem, I wrote a bit of code that reads each file name and its label, and copies the image into a subfolder. Then I acquire them with from_folder. It is quick and practical, but not at all elegant (and it wastes disk space). Can you think of any other method to acquire them with the data block API?
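For reference, the copy-into-subfolders workaround described above could look roughly like the sketch below. It is not the code I actually ran: the function name `copy_into_label_folders` and the parameters `fname_col` and `label_cols` are made up for illustration, and it assumes each row has exactly one label column set to 1.

```python
import shutil
from pathlib import Path

import pandas as pd


def copy_into_label_folders(df, src_dir, dst_dir, fname_col, label_cols):
    """Copy each image into dst_dir/<label>/ based on its one-hot label row.

    All argument names are hypothetical; adapt them to your own DataFrame.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    # For a one-hot row, idxmax(axis=1) gives the name of the column equal to 1.
    labels = df[label_cols].idxmax(axis=1)
    for fname, label in zip(df[fname_col], labels):
        target = dst_dir / str(label)
        target.mkdir(parents=True, exist_ok=True)
        # This duplicates the file, which is where the extra disk space goes.
        shutil.copy2(src_dir / fname, target / Path(fname).name)
    return labels
```

The resulting directory has one subfolder per label, which is the layout a folder-based loader expects.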
If I manage to understand the data block API internals, I’ll try to write something myself. In that case, I’ll open a PR.
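In the meantime, one alternative that avoids copying files might be to derive a single label column from the one-hot columns in a separate DataFrame, leaving the original untouched. This is only a sketch with made-up data: the column names `fname`, `cat`, `dog`, `bird`, and `label` are assumptions, and I have not checked how the result behaves with the data block API.

```python
import pandas as pd

# Toy stand-in for the real DataFrame; all names and values are illustrative.
df = pd.DataFrame({
    "fname": ["a.jpg", "b.jpg", "c.jpg"],
    "cat": [1, 0, 0],
    "dog": [0, 1, 0],
    "bird": [0, 0, 1],
})
label_cols = ["cat", "dog", "bird"]

# Guard: the trick below is only valid if every row has exactly one label.
if not (df[label_cols].sum(axis=1) == 1).all():
    raise ValueError("expected exactly one label per row")

# Build a new two-column DataFrame; df itself is not modified.
# idxmax(axis=1) picks, per row, the name of the column holding the 1.
single = df[["fname"]].assign(label=df[label_cols].idxmax(axis=1))
print(single)
```

The new DataFrame has one file-name column and one label column, so a labeling step that reads a single column should treat the dataset as single-labeled.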