Help with understanding vocab

Hello, I’m using a plain PyTorch Dataset with fastai2. Training works fine, but when I try to use ClassificationInterpretation it complains that the Dataset doesn’t have a vocab defined, and I’m having a hard time understanding how to define one properly. I thought a simple vocab = ['a', 'b'] on the Dataset class would be enough, but the ClassificationInterpretation constructor only keeps the last element of the vocab list.

Bump, can anybody help?

I read the mid-level API tutorial (Tutorial - Assemble the data on the pets dataset | fastai), but there it also defines self.vocab as a simple list of labels. That won’t work with ClassificationInterpretation, which only keeps the last element of vocab.

Hm, that could be a bug? I think the idea behind the code is to support text vocabs, which look like this:

dls.vocab = [[<text vocab>], [<classes>]]

# dls.vocab[-1] will make sure to use the actual classes instead of the text vocab
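As a concrete illustration of that shape (the token and class values here are made-up examples, not from an actual DataLoaders), indexing the nested vocab with -1 picks out the class labels rather than the token vocab:

```python
# A text DataLoaders vocab is a list of two lists: the token vocab for the
# inputs and the class labels for the targets (example values are made up).
vocab = [["xxunk", "xxpad", "the", "cat"],  # token vocab
         ["neg", "pos"]]                    # class labels

print(vocab[-1])  # ['neg', 'pos'] -- the classes, not the tokens
```

So with a flat vocab like ['a', 'b'], that same vocab[-1] lookup would return just 'b', which matches the behaviour described above.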

Thanks Florian,
I tried setting the vocab to [[], ['a','b']], and plot_confusion_matrix() works, but plot_top_losses() still gives me an error.

But I think I figured out how this works. Categorize(vocab=['a','b']).vocab generates a CategoryMap. It looks just like the list ['a','b'] when printed, but it’s a different class, so I guess this makes a big difference for the fastai internals.

Using that as the dataset vocab somehow makes it all work as expected!
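A rough sketch of why a CategoryMap behaves differently from a plain list, using a made-up stand-in class (CategoryMapSketch is not fastai’s actual implementation, just an illustration of the idea):

```python
# A stand-in illustrating what fastai's CategoryMap adds on top of a plain
# list: it prints like a list but also carries a label -> index mapping
# (o2i) that the library uses to encode and decode targets.
class CategoryMapSketch(list):
    def __init__(self, items):
        super().__init__(items)
        # o2i maps each label to its integer index
        self.o2i = {o: i for i, o in enumerate(items)}

vocab = CategoryMapSketch(['a', 'b'])
print(vocab)      # ['a', 'b'] -- looks like a plain list when printed
print(vocab.o2i)  # {'a': 0, 'b': 1}
```

That extra structure is presumably what the interpretation machinery relies on, which would explain why a bare ['a','b'] list wasn’t enough.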
