Hi! I’ve been trying to read three ordered symbols from an image using fastai, but I can’t figure it out.
The best I’ve managed is to create one model for each of the three positions, so that model 1 reads the symbol at position 1, model 2 reads the symbol at position 2, and so on. Then I run all three and combine the results. This works, but I would like to do it using just one model.
After a lot of searching I’ve stumbled upon one-hot encoding, but I’ve yet to find any examples for fastai v2. It seems like I can’t use a NumPy array or a list (the one-hot encoded label) with e.g. ImageDataLoaders.from_name_func, as I get the “unhashable type” error.
I’ve learned that normal multi-label classification uses one-hot encoding, but because some symbols are rare in some positions, I sometimes hit the “Labels ‘position1:half-star’ were not included in the training dataset” error. Is there any way to define what the one-hot encoding should look like, and also to make the model always return the three most likely symbols?
Here are some examples of data using emojis, representing symbols:
The order matters, and there might be duplicates.
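To make the encoding question concrete, here is a minimal sketch in plain Python of a fixed, position-wise one-hot layout that keeps the order and allows duplicates. It is not fastai API: the symbol names in SYMBOLS and the helpers encode and decode are hypothetical and only illustrate the idea of giving each position its own block of slots.

```python
# Hypothetical symbol vocabulary; the real symbol names would go here.
# Fixing it up front means rare symbols always have a slot, even if a
# given training split never shows them in some position.
SYMBOLS = ["star", "half-star", "moon", "sun"]
N_POSITIONS = 3


def encode(label):
    """Turn an ordered sequence of three symbol names into one flat 0/1 vector.

    The vector has one block of len(SYMBOLS) slots per position, so the
    order is preserved and the same symbol may appear more than once.
    """
    if len(label) != N_POSITIONS:
        raise ValueError(f"expected {N_POSITIONS} symbols, got {len(label)}")
    vec = [0] * (N_POSITIONS * len(SYMBOLS))
    for pos, sym in enumerate(label):
        vec[pos * len(SYMBOLS) + SYMBOLS.index(sym)] = 1
    return vec


def decode(scores):
    """Pick the highest-scoring symbol within each position's block."""
    result = []
    for pos in range(N_POSITIONS):
        block = scores[pos * len(SYMBOLS):(pos + 1) * len(SYMBOLS)]
        result.append(SYMBOLS[block.index(max(block))])
    return result
```

Decoding block by block always yields exactly one symbol per position, which is one possible reading of "the three most likely symbols".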
Can anyone show me an example or point me in the right direction for how to use one-hot encoded labels? And maybe some pointers on how to calculate the loss?
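For the loss, one common option for this kind of position-wise output is to treat each position as its own single-label classification and sum the cross-entropy over the three positions. The sketch below shows that in plain Python for a single sample. It is only an illustration under that assumption, not fastai code, and the function names are made up.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy of a softmax over `logits` against one correct index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target_index]


def position_wise_loss(logits, targets, n_symbols):
    """Sum the cross-entropy over all positions.

    `logits` is a flat list with one block of `n_symbols` raw scores per
    position; `targets` holds the correct symbol index for each position.
    """
    total = 0.0
    for pos, target in enumerate(targets):
        block = logits[pos * n_symbols:(pos + 1) * n_symbols]
        total += softmax_cross_entropy(block, target)
    return total
```

In PyTorch the same idea could presumably be expressed by slicing the model output per position, applying torch.nn.functional.cross_entropy to each slice, and summing the results.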
I would really appreciate it if someone could help, as I’ve spent quite a few days on this without getting any closer to a solution.