I am trying to use an external dataset for multi-label (multi-category) classification.
I have tried various approaches to format it like the Pascal dataset used in Lesson 06.
The two most promising approaches are:
- Using str.cat
However, I get individual characters when I go through the vocabulary:
I have tried various combinations, with and without text preprocessing, using NaN, using str(0), and I still get the same character-by-character vocabulary.
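One possible cause of the character-level vocabulary, offered as a guess because the exact code is not shown here: if each target reaches the labelling step as one joined string rather than a list, anything that iterates over the target sees one character at a time. The sketch below uses plain Python with made-up label strings. `build_vocab` and the sample data are hypothetical stand-ins, not fastai code. It only shows the difference between iterating over a string and iterating over a split list.

```python
def build_vocab(targets):
    """Collect the unique labels by iterating over every target.

    Hypothetical helper: it mimics a vocabulary builder that expects each
    target to be a collection of labels.
    """
    return sorted({label for target in targets for label in target})


# Made-up targets in the space-joined form that str.cat would produce.
joined_targets = ["cat dog", "dog", "horse cat"]

# Iterating over a string yields characters, so the vocab is single characters.
char_vocab = build_vocab(joined_targets)

# Splitting each string on its delimiter first yields whole labels.
split_targets = [target.split(" ") for target in joined_targets]
word_vocab = build_vocab(split_targets)

print(char_vocab)
print(word_vocab)
```

If this is what is happening, splitting the joined string on its delimiter before labelling should give whole-word labels, for example through a `get_y` function that calls `.split(' ')`, or through the `label_delim` argument of fastai's `ColReader`.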
- Using a for loop
However, it stops after a single item. I have tried various combinations, including empty lists, appends, ranges, and while loops. I cannot seem to get the right combo to gather all the animals present.
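A loop that stops after one item often returns (or overwrites its result) inside the loop body instead of after it. That is only a guess, since the loop itself is not shown. The sketch below assumes one 0/1 column per animal. The column names, sample rows, and the `labels_for_row` helper are all hypothetical.

```python
ANIMAL_COLUMNS = ["cat", "dog", "horse"]  # hypothetical column names


def labels_for_row(row, columns=ANIMAL_COLUMNS):
    """Return every animal whose encoded column is 1 for this row."""
    present = []                 # create the list once, before the loop
    for column in columns:
        if row[column] == 1:
            present.append(column)
    return present               # return after the loop, not inside it


# Made-up rows standing in for the encoded dataset.
rows = [
    {"cat": 1, "dog": 0, "horse": 1},
    {"cat": 0, "dog": 1, "horse": 0},
    {"cat": 0, "dog": 0, "horse": 0},
]

all_labels = [labels_for_row(row) for row in rows]
print(all_labels)
```

Because the helper only indexes by column name, the same function should also work on a pandas row, for example via `df.apply(labels_for_row, axis=1)`.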
I also tried to pass the encoded columns directly as get_y, where a TypeError was generated:
I would appreciate advice on how best to use data where the multi-label targets are already encoded.
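One option that may fit already-encoded targets, sketched here as an assumption rather than a confirmed fix: fastai's `MultiCategoryBlock` accepts `encoded=True` together with an explicit `vocab`, and `ColReader` can read a list of columns. The column names, the `fname` image column, and the `make_dblock` helper below are hypothetical. The sketch has not been run against this dataset, and the fastai import sits inside the function so the block can be read and loaded without fastai installed.

```python
LABEL_COLUMNS = ["cat", "dog", "horse"]  # hypothetical 0/1 target columns


def make_dblock(image_column="fname", label_columns=LABEL_COLUMNS):
    """Build a DataBlock whose targets are already one-hot encoded.

    Assumes a DataFrame with one image-path column and one 0/1 column
    per label. Both column arguments are illustrative defaults.
    """
    from fastai.vision.all import (
        ColReader, DataBlock, ImageBlock, MultiCategoryBlock, RandomSplitter,
    )

    return DataBlock(
        # encoded=True tells the block the targets are already 0/1 vectors;
        # vocab supplies the label name for each position.
        blocks=(ImageBlock, MultiCategoryBlock(encoded=True, vocab=label_columns)),
        get_x=ColReader(image_column),   # add pref=... if paths are relative
        get_y=ColReader(label_columns),  # reads all encoded columns per row
        splitter=RandomSplitter(valid_pct=0.2, seed=42),
    )
```

With a DataFrame `df` in that shape, `make_dblock().dataloaders(df)` would be the next step. If the TypeError came from a plain `MultiCategoryBlock()` receiving 0/1 values, setting `encoded=True` is the part most likely to matter.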