Filter at epoch begin

I am trying to implement Curriculum Learning as a callback.

My current implementation makes a new ImageList -> DataBunch every time, with a filter_by_func to keep only the examples of the correct difficulty level. This works but seems wasteful and inelegant because I am running

    image_list = ImageList.from_folder(path)
    image_list = ClassUtils.filter_classes(image_list, classes, woof)
    data = (image_list

            .label_from_folder().transform(([flip_lr(p=flip_lr_p)], []), size=size)
            .databunch(bs=bs, num_workers=workers, shuffle_train=shuffle_train)
            .presize(size, scale=(0.35,1)))

every epoch.

Unfortunately, when I try to filter_by_func the resulting DataBunch, later in the process, I get an error that xb is meant to be a Tensor, not an Image.

Is there a better way to filter at the beginning of every epoch? Can a DeviceDataLoader be filtered? (Potentially related: what is CallbackHandler.set_dl used for?)


I think what you want is called a sampler (from PyTorch, not fastai), to only loop through certain elements of your dataset. You can pass one after creating your data with:

    data.train_dl = data.train_dl.new(shuffle=False, sampler=my_sampler)
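
For what it's worth, here is a rough sketch of what such a sampler could look like for curriculum learning. It is only an illustration, not something fastai ships: the class name `CurriculumSampler`, the `difficulties` list (one difficulty score per training item, in dataset order) and the `set_level` method are all made up for the example, and I am assuming "correct difficulty level" means "at or below the current level" (swap `<=` for `==` if you want exactly one level per epoch).

```python
import random
from torch.utils.data import Sampler

class CurriculumSampler(Sampler):
    """Hypothetical sampler: yields only indices whose difficulty <= current level."""

    def __init__(self, difficulties, level=0, shuffle=True):
        # difficulties[i] is an assumed per-item difficulty score for dataset item i
        self.difficulties = list(difficulties)
        self.shuffle = shuffle
        self.set_level(level)

    def set_level(self, level):
        # Recompute the allowed indices; takes effect the next time the
        # DataLoader starts iterating (i.e. at the next epoch).
        self.level = level
        self.indices = [i for i, d in enumerate(self.difficulties) if d <= level]

    def __iter__(self):
        idxs = list(self.indices)
        if self.shuffle:
            random.shuffle(idxs)  # shuffle within the allowed subset
        return iter(idxs)

    def __len__(self):
        return len(self.indices)
```

The idea would be to build `my_sampler = CurriculumSampler(difficulties)` once, pass it in as above, and then call `my_sampler.set_level(...)` from your callback at the start of each epoch. Since the DataLoader asks the sampler for a fresh iterator every epoch, the DataBunch should not need to be rebuilt.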