Can someone comment on how the ImageDataBunch object splits the dataset into train/val/test, i.e. implementation-wise? What kind of validation is used?

By default the validation set is a random 20% split of the training data (valid_pct=0.2). A test set is only added if you pass a test folder. See the source code below.

def from_lists(cls, path:PathOrStr, fnames:FilePathList, labels:Collection[str], valid_pct:int=0.2, test:str=None, **kwargs):
        classes = uniqueify(labels)
        train,valid = random_split(valid_pct, fnames, labels)
        datasets = [ImageClassificationDataset(*train, classes),
                    ImageClassificationDataset(*valid, classes)]
        if test: datasets.append(ImageClassificationDataset.from_single_folder(Path(path)/test, classes=classes))
        return cls.create(*datasets, path=path, **kwargs)
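
To make the "random 20% split" concrete, here is a minimal standalone sketch of the idea. It is not fastai's actual random_split; the helper name random_split_sketch and its seed argument are made up for illustration, and it only assumes that file names and labels are parallel lists.

```python
import random

def random_split_sketch(valid_pct, items, labels, seed=None):
    """Hypothetical helper (not the fastai implementation):
    hold out roughly valid_pct of the items at random for validation."""
    if len(items) != len(labels):
        raise ValueError("items and labels must have the same length")
    rng = random.Random(seed)
    idx = list(range(len(items)))
    rng.shuffle(idx)                      # random order of all indices
    n_valid = int(len(items) * valid_pct) # size of the held-out part
    valid_idx, train_idx = idx[:n_valid], idx[n_valid:]
    # Return (items, labels) pairs so they can be unpacked like *train / *valid
    train = ([items[i] for i in train_idx], [labels[i] for i in train_idx])
    valid = ([items[i] for i in valid_idx], [labels[i] for i in valid_idx])
    return train, valid

# Toy data standing in for fnames and labels
fnames = [f"img_{i}.jpg" for i in range(10)]
labels = ["cat" if i % 2 else "dog" for i in range(10)]
train, valid = random_split_sketch(0.2, fnames, labels, seed=0)
print(len(train[0]), len(valid[0]))
```

Note that this is a single random hold-out split, not k-fold cross-validation, and it does not stratify by class, so with small or imbalanced datasets the class proportions in the validation set can differ from those in the training set.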

When calling get_transforms(), why is the xtra_tfms kwarg specified to be a float?
I would have thought it would be some kind of list of operations like [ contrast, crop_pad,…]
The documentation further down the page suggests this:

  • xtra_tfms : a list of additional transforms you would like to be applied

…but that’s not a float?


Where do you see that? The code has:


Apparently the code was changed since I posted that question. Commit 32633d9 is where the change was made:

Thanks for your response! Question resolved.

Ah thanks for the explanation - I thought I was going crazy…


I am looking for example code, ideally fast.ai, where language models are used for neural language translation. From last year's class I have a notebook using word embeddings, but I wonder whether translation with a language model is possible (I think it should be?) and where to find code for it.

Maybe that is something that comes up in lesson 7 or fast.ai part 2?

Thanks for pointing me in the right direction!