I created a DataBlock and the corresponding DataLoaders. I’m working with images and mapping them to a MultiCategoryBlock. I can inspect the vocabulary with dls.train.vocab. Is there a quick way to see how many images I have per category?
Some (pseudo) code:
import fastbook
fastbook.setup_book()
from fastai.vision.all import *
from fastbook import *
d = {'file': ['file1.jpg', 'file1.jpg', 'file1.jpg'], 'tags': ['a b', 'a', 'b']}
df = pd.DataFrame(data=d)
path = Path('images')  # placeholder image folder (hypothetical)
def get_x(r): return path/r['file']
def get_y(r): return r['tags'].split(' ')
dblock = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
                   get_x=get_x, get_y=get_y,
                   item_tfms=RandomResizedCrop(128, min_scale=0.35))
dls = dblock.dataloaders(df, bs=1)
dls.train.vocab
> ['a', 'b']
# pseudocode (category_len is not a real attribute, just the kind of thing I'm looking for):
dls.train.category_len
> {'a': 2, 'b': 2}
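
For reference, the result I'm after can be sketched in plain Python by counting the tags directly from the source data rather than from the DataLoaders. This is only a minimal sketch: `count_tags` is a hypothetical helper (not a fastai API), and it assumes the tags are space-separated strings as in the toy data above.

```python
from collections import Counter

# Same toy data as above: one space-separated tag string per image.
d = {'file': ['file1.jpg', 'file1.jpg', 'file1.jpg'], 'tags': ['a b', 'a', 'b']}

def count_tags(tag_strings):
    """Count how many items carry each tag (hypothetical helper, not part of fastai)."""
    counts = Counter()
    for tags in tag_strings:
        # set() so that a tag repeated within one item is counted once for that item
        counts.update(set(tags.split()))
    # Sort by tag name so the result has a stable order
    return dict(sorted(counts.items()))

category_len = count_tags(d['tags'])
print(category_len)
```

With a DataFrame, `df['tags'].str.split().explode().value_counts()` should give similar per-tag counts. Note that both variants count over the whole dataset, not only the training split.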