How to set custom vocab for MultiCategoryBlock

Issue:

Running this code gives the vocab in the wrong format

Instead, I need the vocab to look like this

Attempt:

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-58-27c061c32e60> in <module>
----> 1 data = DataBlock(blocks=(ImageBlock, MultiCategoryBlock(vocab=CategoryMap([str(i) for i in range(24)], sort=False))),
      2                    splitter=RandomSplitter(valid_pct=0.2, seed=42),
      3                    get_x=ColReader('recording_id'),
      4                    get_y=ColReader('species_id', label_delim=' '),
      5                  )

/opt/conda/lib/python3.7/site-packages/fastai/data/block.py in MultiCategoryBlock(encoded, vocab, add_na)
     27 def MultiCategoryBlock(encoded=False, vocab=None, add_na=False):
     28     "`TransformBlock` for multi-label categorical targets"
---> 29     tfm = EncodedMultiCategorize(vocab=vocab) if encoded else [MultiCategorize(vocab=vocab, add_na=add_na), OneHotEncode]
     30     return TransformBlock(type_tfms=tfm)
     31 

/opt/conda/lib/python3.7/site-packages/fastcore/transform.py in __call__(cls, *args, **kwargs)
     37             getattr(cls,n).add(f)
     38             return f
---> 39         return super().__call__(*args, **kwargs)
     40 
     41     @classmethod

/opt/conda/lib/python3.7/site-packages/fastai/data/transforms.py in __init__(self, vocab, add_na)
    253     "Reversible transform of multi-category strings to `vocab` id"
    254     loss_func,order=BCEWithLogitsLossFlat(),1
--> 255     def __init__(self, vocab=None, add_na=False): super().__init__(vocab=vocab,add_na=add_na,sort=vocab==None)
    256 
    257     def setups(self, dsets):

/opt/conda/lib/python3.7/site-packages/fastai/data/transforms.py in __eq__(self, b)
    225         return L(self.items[o] for o in ids)
    226 
--> 227     def __eq__(self,b): return all_equal(b,self)
    228 
    229 # Cell

/opt/conda/lib/python3.7/site-packages/fastai/imports.py in all_equal(a, b)
     62     "Compares whether `a` and `b` are the same length and have the same contents"
     63     if not is_iter(b): return False
---> 64     return all(equals(a_,b_) for a_,b_ in itertools.zip_longest(a,b))
     65 
     66 def noop (x=None, *args, **kwargs):

TypeError: zip_longest argument #1 must support iteration
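Reading the traceback, the failure seems to come from the `sort=vocab==None` expression in `MultiCategorize.__init__`: because `vocab` is a `CategoryMap`, `==` dispatches to `CategoryMap.__eq__`, which tries to zip over `None`. Here is a minimal sketch of that mechanism in plain Python. `TinyVocab` is a hypothetical stand-in I made up for illustration, not the fastai class:

```python
import itertools

class TinyVocab:
    "Hypothetical stand-in for a vocab object whose __eq__ compares element-wise"
    def __init__(self, items): self.items = list(items)
    def __iter__(self): return iter(self.items)
    def __eq__(self, other):
        # Same shape as the all_equal call in the traceback: zip over both operands
        return all(a == b for a, b in itertools.zip_longest(other, self))

vocab = TinyVocab(str(i) for i in range(24))

try:
    vocab == None          # calls TinyVocab.__eq__, which tries to iterate over None
    raised = False
except TypeError:
    raised = True

is_none = vocab is None    # identity check never calls __eq__, so it is safe
print(raised, is_none)
```

So an `==` comparison against `None` is only as safe as the left operand's `__eq__`, whereas `is None` always works.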

Question:

In general, what is the proper way to specify a custom vocab?

Just pass a plain list, vocab=['0', '1', '2', '3'...], to your MultiCategoryBlock; you don't need to wrap it in CategoryMap.
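For the 24 classes in the question that would be `MultiCategoryBlock(vocab=[str(i) for i in range(24)])`. Going by the `sort=vocab==None` line in the traceback, a plain list should keep the order you give it. My guess (an assumption, since the original output isn't shown) is that the "wrong format" was string sorting, where '10' lands before '2'. A plain-Python sketch of the difference, with a made-up `encode` helper that only imitates what the block does conceptually:

```python
# Vocab in numeric order, as a plain list of strings
vocab = [str(i) for i in range(24)]
o2i = {label: idx for idx, label in enumerate(vocab)}

def encode(labels, delim=' '):
    "Hypothetical helper: split a label string and one-hot encode it against `vocab`"
    ids = {o2i[l] for l in labels.split(delim)}
    return [1.0 if i in ids else 0.0 for i in range(len(vocab))]

# What lexicographic sorting would do to the same labels
sorted_vocab = sorted(vocab)
print(vocab[:4])         # numeric order
print(sorted_vocab[:4])  # string order: '10' and '11' come before '2'

one_hot = encode('3 10')
print(one_hot.index(1.0), sum(one_hot))
```

With the unsorted list, class id and one-hot position line up, which is usually what you want when the labels are already integer ids.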

Shall we also give a “c” parameter to OneHotEncode?