Is there a way to trim a dataset inside a data bunch before training?

Thank you, this would help me create subsets of labels, which would be a big improvement 🙂

I’d still like to reduce the number of samples per label, but I think I have a direction (as provided by this very recent post): ImageDataBunch has a classes array, and I have a train_ds whose items are (image, class_num) tuples, so in theory I can filter the contents of train_ds by class. Will report back if I get anywhere.
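To make the idea concrete, here is a rough sketch in plain Python of what I have in mind. It assumes only that the dataset behaves like a sequence of (image, class_num) pairs with a hashable class index. The function name `cap_per_class` and its parameters are made up for illustration and are not part of fastai, and the toy list stands in for a real train_ds.

```python
import random
from collections import defaultdict

def cap_per_class(samples, max_per_class, keep_classes=None, seed=0):
    """Return sorted indices of the samples to keep.

    samples: sequence of (item, class_index) pairs (assumed layout).
    max_per_class: upper bound on how many samples to keep per class.
    keep_classes: optional set of class indices to keep; None keeps all.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    by_class = defaultdict(list)

    # Group sample indices by class, dropping unwanted classes.
    for idx, (_, label) in enumerate(samples):
        if keep_classes is None or label in keep_classes:
            by_class[label].append(idx)

    kept = []
    for label, idxs in by_class.items():
        # Randomly downsample classes that exceed the cap.
        if len(idxs) > max_per_class:
            idxs = rng.sample(idxs, max_per_class)
        kept.extend(idxs)
    return sorted(kept)

# Toy stand-in for a dataset: 12 "images" spread over 3 classes.
toy_ds = [(f"img_{i}", i % 3) for i in range(12)]

kept = cap_per_class(toy_ds, max_per_class=2, keep_classes={0, 2})
subset = [toy_ds[i] for i in kept]
```

On a real image dataset, looping over the items like this may load every image, so reading the labels from wherever the dataset stores them would probably be cheaper. The resulting indices would still need to be applied to the data before the data bunch is built, and I have not worked out that part yet.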
