Epochs of arbitrary length

Could you provide an example? I am trying to use partial_dataloaders with images, but I am getting:
TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'fastai2.vision.core.PILImage'>

My code:

data_block = DataBlock(
    blocks=(
        ImageBlock,
        CategoryBlock
    ),
    n_inp=1,
    get_items=get_image_files,
    get_y=parent_label,
    splitter=RandomSplitter(),
    item_tfms=[Resize(self.input_size), ToTensor],
    batch_tfms=aug_transforms(max_warp=0)
)
tdls = data_block.datasets(self.data_dir).partial_dataloaders(bs=10, partial_n=30)

Does it work when you use regular DataLoaders?

I tried it with

splits = RandomSplitter(valid_pct=0.01)(range_of(df_train))

to = TabularPandas(
    df_train,
    y_names="target",
    cat_names = cat_names,
    cont_names = cont_names,
    procs = [Categorify, FillMissing, Normalize],
    splits=splits
)

# and convert it to dataloaders (batch_size below is unused; the call passes bs=10)
batch_size = 64
dls = to.partial_dataloaders(bs=10, partial_n=30)

It did not work. During learner fitting I got: KeyError: 131235

I like the idea and it could be a kwarg for the dataloader() method.

The new (v2) API for dataloaders has an n argument that limits the size to n samples; see Data core | fastai

If I understood the implementation correctly, it selects a different set each time.
It works with DataBlock, and it does indeed use only a subset of the dataset during training. Code example:

dls = dataset.dataloaders(path, bs=16, n=800)
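
To illustrate the general idea of an epoch that covers only a fresh random subset of the data, here is a minimal plain-Python sketch. It is not fastai's implementation and makes no claim about how the n argument picks its samples; epoch_indices, batches, and the sizes used are hypothetical names and values chosen for illustration only.

```python
import random


def epoch_indices(dataset_size, partial_n, rng):
    """Return partial_n distinct indices drawn at random from range(dataset_size)."""
    if partial_n > dataset_size:
        raise ValueError("partial_n cannot exceed dataset_size")
    return rng.sample(range(dataset_size), partial_n)


def batches(indices, bs):
    """Split a list of indices into consecutive batches of at most bs items."""
    return [indices[i:i + bs] for i in range(0, len(indices), bs)]


# Seeded only so the sketch is reproducible; a real training loop need not seed.
rng = random.Random(0)

for epoch in range(3):
    # A new subset is drawn at the start of every epoch.
    idxs = epoch_indices(dataset_size=1000, partial_n=30, rng=rng)
    epoch_batches = batches(idxs, bs=10)
    print(f"epoch {epoch}: {len(epoch_batches)} batches, first indices {epoch_batches[0][:3]}")
```

With partial_n=30 and bs=10, each epoch consists of 3 batches, regardless of how large the full dataset is.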