Note: 2nd level headings are for modules, 3rd level headings are for functions/classes.
get_files returns an L list of all the non-hidden files in path, optionally recursing into subfolders (recurse), and only those under an optional include directory if one is given.
    source = untar_data(URLs.MNIST_TINY)
    all_files = get_files(source)
    train_files = get_files(source, folders='train')
    valid_img_files = get_files(source, folders='valid', extensions='.png')
    labels = get_files(source, recurse=False)
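To make the filtering rules concrete, here is a rough standard-library sketch of the same idea. It is not fastai's implementation: list_files and _as_set are hypothetical names, it returns plain relative path strings rather than an L, and it assumes a file counts as "under an include directory" when any of its parent folders has that name.

```python
import tempfile
from pathlib import Path

def _as_set(x):
    # Accept None, a single string, or an iterable of strings.
    if x is None:
        return None
    return {x} if isinstance(x, str) else set(x)

def list_files(path, extensions=None, recurse=True, folders=None):
    """Hypothetical stand-in for get_files, built only on pathlib."""
    path = Path(path)
    exts, folders = _as_set(extensions), _as_set(folders)
    found = []
    for p in (path.rglob('*') if recurse else path.iterdir()):
        rel = p.relative_to(path)
        if not p.is_file():
            continue
        if any(part.startswith('.') for part in rel.parts):
            continue  # hidden file, or file inside a hidden directory
        if exts is not None and p.suffix.lower() not in exts:
            continue
        if folders is not None and not folders.intersection(rel.parts[:-1]):
            continue  # no include directory appears in the file's path
        found.append(rel.as_posix())
    return sorted(found)

# Build a tiny throwaway tree that mimics the layout used above.
root = Path(tempfile.mkdtemp())
for name in ['labels.csv', '.hidden', 'train/3/a.png', 'valid/7/b.png', 'valid/7/notes.txt']:
    (root / name).parent.mkdir(parents=True, exist_ok=True)
    (root / name).touch()

all_files = list_files(root)
train_files = list_files(root, folders='train')
valid_pngs = list_files(root, folders='valid', extensions='.png')
top_level = list_files(root, recurse=False)
```

The four calls mirror the four get_files calls above: everything, one include folder, one include folder plus an extension filter, and the top level only.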
FileGetter creates and returns a partial get_files function that searches the path suffix suf and passes along any other args.
    source = untar_data(URLs.MNIST_TINY)
    get_train = FileGetter(suf='train')
    get_valid = FileGetter(suf='valid')
    train_files = get_train(source)
    valid_imgs = get_valid(source, extensions='.png')
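The "partial function" pattern can be sketched without fastai. The names make_getter and fake_search below are hypothetical; fake_search only echoes what it was asked to look for, so the example shows how the suffix and the extra arguments are forwarded, not how files are found.

```python
from pathlib import Path

def make_getter(search, suf='', **defaults):
    """Hypothetical FileGetter-like factory around any search(path, **kwargs)."""
    def _inner(path, **kwargs):
        # Arguments fixed at creation time can still be overridden per call.
        return search(Path(path) / suf, **{**defaults, **kwargs})
    return _inner

def fake_search(path, extensions=None):
    # Stand-in for a real file search: just report the request.
    return (path.as_posix(), extensions)

get_train = make_getter(fake_search, suf='train')
get_valid_png = make_getter(fake_search, suf='valid', extensions='.png')

train_request = get_train('mnist_tiny')
valid_request = get_valid_png('mnist_tiny')
override_request = get_valid_png('mnist_tiny', extensions='.jpg')
```

Fixing suf once and reusing the getter keeps the train and valid lookups symmetrical, which is the main convenience the factory offers.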
get_image_files returns an L list of all image files found recursively in path, restricted to an optional include directory if one is given.
    source = untar_data(URLs.MNIST_TINY)
    train_imgs = get_image_files(source, folders='train')
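One plausible way to decide what counts as an "image file" is to ask the standard mimetypes table which suffixes map to an image/* type. The sketch below assumes that definition; IMAGE_EXTENSIONS and is_image_file are illustrative names, and the file names are made up.

```python
import mimetypes
from pathlib import Path

# Assumption: an image file is one whose suffix maps to an image/* MIME type.
IMAGE_EXTENSIONS = {ext for ext, mime in mimetypes.types_map.items()
                    if mime.startswith('image/')}

def is_image_file(name):
    # Compare case-insensitively so '.JPG' and '.jpg' are treated alike.
    return Path(name).suffix.lower() in IMAGE_EXTENSIONS

names = ['train/3/7.png', 'train/3/readme.txt', 'valid/7/9.JPG', 'labels.csv']
image_names = [n for n in names if is_image_file(n)]
```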
RandomSplitter splits a dataset into training and validation sets. It creates two sets of shuffled indexes, one for train and one for valid.
RandomSplitter returns a function that takes a list of objects (e.g. filenames). If the list has 1000 items and 20% of them are needed for validation, that function returns two lists of shuffled indexes: one for the train set (800 indexes) and one for the valid set (200 indexes).
    source = untar_data(URLs.PETS)/"images"
    items = get_image_files(source)[:1000]
    split_idx = RandomSplitter(valid_pct=0.2)(items)
    len(split_idx), len(split_idx[0]), len(split_idx[1])
    Output: (2, 800, 200)
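The arithmetic can be checked with a small standard-library sketch. The function random_splitter is a hypothetical imitation, not the fastai class; the seed parameter and the choice to take the validation indexes from the front of the shuffled list are assumptions made for the example.

```python
import random

def random_splitter(valid_pct=0.2, seed=None):
    """Hypothetical sketch: returns a function mapping items to (train_idxs, valid_idxs)."""
    def _inner(items):
        rng = random.Random(seed)
        idxs = list(range(len(items)))
        rng.shuffle(idxs)
        cut = int(valid_pct * len(items))
        # The first `cut` shuffled indexes go to validation, the rest to training.
        return idxs[cut:], idxs[:cut]
    return _inner

items = [f'img_{i}.jpg' for i in range(1000)]
train_idx, valid_idx = random_splitter(valid_pct=0.2, seed=42)(items)
sizes = (len(train_idx), len(valid_idx))
```

Because the splitter returns indexes rather than items, the same split can be applied to the filenames, the labels, or any other list of the same length.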
Categorize converts label strings to vocab ids and back.
    tcat = Categorize(vocab=['cat','dog'])
    lbl = tcat('cat'); lbl
    Output: 0
    # For reversing/decoding
    tcat.decode(0)
    Output: 'cat'
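The label-to-id mapping boils down to a pair of lookups. A minimal sketch, assuming the vocab is sorted and ids are positions in that sorted list; SimpleCategorize is a hypothetical class and returns plain ints rather than tensor types:

```python
class SimpleCategorize:
    """Hypothetical sketch of a label <-> id mapping over a fixed vocab."""
    def __init__(self, vocab):
        self.vocab = sorted(set(vocab))
        # Reverse lookup table: label -> position in the vocab.
        self.o2i = {label: i for i, label in enumerate(self.vocab)}

    def __call__(self, label):
        return self.o2i[label]

    def decode(self, idx):
        return self.vocab[idx]

simple_cat = SimpleCategorize(['cat', 'dog'])
ids = [simple_cat(label) for label in ['dog', 'cat', 'dog']]
round_trip = [simple_cat.decode(i) for i in ids]
```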
This uses the metaclass _TfmMeta. The class has two functions, encodes and decodes. Whenever you call an instance on an item (a function call with (), not indexing with [ ]), the encodes function is invoked automatically via the _call() method defined in the class. decodes() is not invoked automatically; it is usually reached through the decode() method defined in the class.
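The call flow described here can be imitated in a few lines. This is only a sketch of the dispatch idea: MiniTransform and AddOne are hypothetical, there is no metaclass involved, and the _call helper is a simplified guess at the role the real method plays.

```python
class MiniTransform:
    """Hypothetical stand-in: calling the object encodes, decode() reverses."""
    def encodes(self, x):
        return x

    def decodes(self, x):
        return x

    def _call(self, name, x):
        # Look up 'encodes' or 'decodes' by name and apply it.
        return getattr(self, name)(x)

    def __call__(self, x):
        return self._call('encodes', x)

    def decode(self, x):
        return self._call('decodes', x)

class AddOne(MiniTransform):
    def encodes(self, x):
        return x + 1

    def decodes(self, x):
        return x - 1

tfm = AddOne()
encoded = tfm(5)               # goes through __call__ -> _call -> encodes
decoded = tfm.decode(encoded)  # goes through decode -> _call -> decodes
```

Subclasses only override encodes and decodes; the public entry points stay the same for every transform.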
In a pipeline where a list of transformations is applied,
    pipe = Pipeline([f2,f3,f1])
pipe.decode() calls decode() on each of the transformations in reverse order (assuming they all share the same order attribute): f1.decode(), then f3.decode(), then f2.decode(). Each decode() calls the matching decodes() internally, so f1.decode() calls f1.decodes().
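A small logging sketch makes the order of calls visible. LoggedStep and MiniPipeline are hypothetical classes, and the sketch assumes, as is usual for invertible pipelines, that decoding runs from the last step back to the first.

```python
class LoggedStep:
    """Hypothetical transform that records when it encodes or decodes."""
    def __init__(self, name, log):
        self.name, self.log = name, log

    def __call__(self, x):
        self.log.append(f'{self.name}.encode')
        return x

    def decode(self, x):
        self.log.append(f'{self.name}.decode')
        return x

class MiniPipeline:
    def __init__(self, steps):
        self.steps = list(steps)

    def __call__(self, x):
        for step in self.steps:
            x = step(x)
        return x

    def decode(self, x):
        # Undo the steps in the opposite order to how they were applied.
        for step in reversed(self.steps):
            x = step.decode(x)
        return x

log = []
f1, f2, f3 = (LoggedStep(name, log) for name in ('f1', 'f2', 'f3'))
mini_pipe = MiniPipeline([f2, f3, f1])
restored = mini_pipe.decode(mini_pipe('item'))
```

After the round trip, log lists the three encode calls in list order followed by the three decode calls in the opposite order.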
This is a subclass of Transform. It sets as_item_force = False, which lets the Transform return the result of the encodes function as a tuple. This allows encodes to be applied selectively to the items within the tuple that match a criterion.
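Selective application over a tuple can be sketched as follows. The class names and the applies_to attribute are invented for illustration, and a plain isinstance check stands in for whatever matching criterion the real class uses.

```python
class MiniTupleTransform:
    """Hypothetical sketch: apply encodes only to tuple elements of a chosen type."""
    applies_to = object

    def encodes(self, x):
        return x

    def __call__(self, item):
        if isinstance(item, tuple):
            # Transform matching elements, pass the others through untouched.
            return tuple(self.encodes(o) if isinstance(o, self.applies_to) else o
                         for o in item)
        return self.encodes(item) if isinstance(item, self.applies_to) else item

class Negate(MiniTupleTransform):
    applies_to = int

    def encodes(self, x):
        return -x

result = Negate()((3, 'label', 4.5))
```

Only the int element is negated; the string and the float come back unchanged, and the result is still a tuple.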