How to use the vocab parameter in ImageDataLoaders.from_folder

If a vocab is passed, only the folders with names in vocab are kept.

Does anyone know how to use the vocab parameter?
I am looking for some examples and explanation to understand the usage.


Suppose you have ten folders: one contains pictures of airplanes, another automobiles, another birds, and so on. If you don't pass vocab, fastai treats every folder as a label, so you get ten classes. If you are only interested in airplanes and automobiles, you can pass vocab=['airplane', 'automobile'], and fastai will set up your DataLoaders with only those two classes.


from fastai.vision.all import *

path = untar_data(URLs.CIFAR)/'train'

# Without vocab, every folder becomes a label,
# so dls has all ten CIFAR classes.
dls = ImageDataLoaders.from_folder(path, valid_pct=0.1, seed=42) 

# Here, your only labels are airplane and automobile.
# The rest of the folders are ignored.
dls = ImageDataLoaders.from_folder(path, valid_pct=0.1, seed=42,
                                   vocab=['airplane', 'automobile'])
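
If it helps to see the idea without downloading anything, here is a minimal plain-Python sketch of the behaviour described above: labels come from the parent folder name, and only folders named in vocab are kept. This is not fastai's code. The helpers files_in_vocab and make_demo_tree are hypothetical names made up for this illustration, and the three-folder tree is toy data with empty placeholder files.

```python
from pathlib import Path
import tempfile

def files_in_vocab(root, vocab, suffix=".png"):
    """Hypothetical helper: return (file, label) pairs, keeping only
    files whose parent folder name appears in vocab."""
    keep = set(vocab)
    pairs = []
    for f in sorted(Path(root).rglob(f"*{suffix}")):
        label = f.parent.name  # the label is the parent folder name
        if label in keep:
            pairs.append((f, label))
    return pairs

def make_demo_tree(root):
    """Hypothetical helper: build a toy folder-per-class layout
    with two empty placeholder files per class."""
    for label in ["airplane", "automobile", "bird"]:
        folder = Path(root) / label
        folder.mkdir(parents=True)
        for i in range(2):
            (folder / f"{i}.png").touch()

with tempfile.TemporaryDirectory() as tmp:
    make_demo_tree(tmp)
    # Keep only two of the three folders; 'bird' is ignored.
    pairs = files_in_vocab(tmp, ["airplane", "automobile"])
    labels = sorted({label for _, label in pairs})
    print(labels, len(pairs))
```

With the real DataLoaders, you can inspect dls.vocab after construction to confirm which classes ended up in the vocabulary.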

Hey @BobMcDear,

Thanks for the explanation and the example.
They make the usage of the vocab parameter very clear.

By the way, sorry for the late acknowledgment.

