I thought MaskBlock?? would be a good place to start looking. It instantiates AddMaskCodes, which sets a vocab attribute.
class AddMaskCodes(Transform):
    "Add the code metadata to a `TensorMask`"
    def __init__(self, codes=None):
        self.codes = codes
        if codes is not None: self.vocab,self.c = codes,len(codes)
    …
So we just have to find the place where the dataloader stores that transform.
A look at the attributes of learn.dls.train shows that AddMaskCodes sits in a Pipeline at the after_item attribute. You can check which attributes an object has with learn.dls.train.__dict__ or vars(learn.dls.train).
learn.dls.train.__dict__
>> {'after_item': Pipeline: AddMaskCodes -> ToTensor,
'before_batch': Pipeline: ,
'after_batch': Pipeline: IntToFloatTensor -- {'div': 255.0, 'div_mask': 1} -> Normalize -- {'mean': tensor([[[[0.4850]],
[[0.4560]],
…
}
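If you want to try the vars() trick without a trained learner at hand, here is a minimal sketch in plain Python. FakeLoader and its contents are made up for illustration (the attribute names just mimic the output above), so this is not what a real fastai dataloader looks like inside.

```python
class FakeLoader:
    """Hypothetical stand-in for a dataloader; attribute names mimic the post."""
    def __init__(self):
        self.after_item = ["AddMaskCodes", "ToTensor"]
        self.before_batch = []
        self.after_batch = ["IntToFloatTensor", "Normalize"]

dl = FakeLoader()

# vars(obj) returns the very same dict as obj.__dict__
attrs = vars(dl)
same_dict = attrs is dl.__dict__

# Dump every instance attribute with its value
for name, value in attrs.items():
    print(f"{name}: {value}")

# Narrow the dump down to the attributes that mention the transform we want
hits = [name for name, value in attrs.items() if "AddMaskCodes" in value]
print(hits)
```

On a real object the values are not plain lists, so you would search their repr (for example `"AddMaskCodes" in repr(value)`) instead.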
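So by picking the first transform from the pipeline we get what we were looking for: learn.dls.train.after_item[0].vocab.
learn.dls.train.after_item.vocab works too because of… fastcore magic (as far as I can tell, a Pipeline forwards attribute lookups it can't resolve itself to the transforms it contains).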
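To make the "magic" a bit less magical, here is a rough sketch of the idea in plain Python. MiniPipeline is a hypothetical class I made up, and both transforms are simplified stand-ins; fastcore's real Pipeline is more involved, so treat this as an illustration of attribute forwarding, not as its implementation.

```python
class AddMaskCodes:
    """Simplified stand-in for the fastai transform: keeps codes as vocab and c."""
    def __init__(self, codes=None):
        self.codes = codes
        if codes is not None:
            self.vocab, self.c = codes, len(codes)

class ToTensor:
    """Placeholder transform that carries no extra attributes."""

class MiniPipeline:
    """Hypothetical pipeline that forwards unknown attributes to its transforms."""
    def __init__(self, *fs):
        self.fs = list(fs)

    def __getitem__(self, i):
        return self.fs[i]

    def __getattr__(self, name):
        # __getattr__ only runs when the normal lookup on the pipeline fails
        if name == "fs":
            raise AttributeError(name)  # guard against recursion before __init__
        for f in self.fs:
            if hasattr(f, name):
                return getattr(f, name)
        raise AttributeError(name)

after_item = MiniPipeline(AddMaskCodes(["background", "road", "car"]), ToTensor())

print(after_item[0].vocab)  # explicit: pick the transform, then the attribute
print(after_item.vocab)     # forwarded: the pipeline finds it on AddMaskCodes
print(after_item.c)
```

The explicit index and the forwarded lookup return the same list; the forwarded form simply saves you from knowing the position of the transform in the pipeline.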
I got that idea from doing a symbol search for vocab (Ctrl+T in VSCode).
The first thing that stood out to me was get_c, since that is the function that derives the number of categories from your data/dataloaders to build the model's classification layer, and thus might also be a good place to look regarding your problem.
def get_c(dls):
    if getattr(dls, 'c', False): return dls.c
    if nested_attr(dls, 'train.after_item.c', False): return dls.train.after_item.c
    …
Since AddMaskCodes sets c as well as vocab, you could expect to be able to access the vocab the same way get_c accesses the count, and it worked.
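The same lookup pattern can be sketched without fastai installed. Both nested_attr_sketch and get_vocab below are hypothetical helpers I wrote for illustration (get_vocab just mirrors the shape of get_c), and the dls object is a fake built from SimpleNamespace, so none of this is library code.

```python
from types import SimpleNamespace

def nested_attr_sketch(obj, path, default=None):
    """Follow a dotted attribute path; return default if any step is missing."""
    for name in path.split("."):
        try:
            obj = getattr(obj, name)
        except AttributeError:
            return default
    return obj

def get_vocab(dls):
    """Hypothetical counterpart to get_c that looks up vocab instead of c."""
    if getattr(dls, "vocab", None) is not None:
        return dls.vocab
    return nested_attr_sketch(dls, "train.after_item.vocab", None)

# Fake dataloaders object with the same nesting as in the post
codes = ["background", "road", "car"]
after_item = SimpleNamespace(vocab=codes, c=len(codes))
dls = SimpleNamespace(train=SimpleNamespace(after_item=after_item))

print(get_vocab(dls))
print(nested_attr_sketch(dls, "train.after_item.c"))
print(nested_attr_sketch(dls, "train.missing.vocab", "n/a"))
```

The default argument is what makes this kind of guessing cheap: a wrong path gives you a fallback value instead of an exception.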
My response got quite detailed, but maybe someone finds it helpful… basically you just guess and try a bunch of things and hope that one of them holds the information you are searching for. The more you guess, the more details you pick up and the more educated your guesses become.
The code base might seem intimidating in the beginning, but it's extremely worthwhile to spend time in it, since it speeds up your debugging process tremendously (that process took me ~1 min.).