How to get datasets from DataLoaders

Given code like this:

dls = ImageDataLoaders.from_df(
    df=train_df,
    fn_col=0, label_col=1,
    path=paths['competition'],
    folder='train',
    suff='.tif',
    valid_pct=0.2,
    seed=CONFIG.seed,
    #item_tfms=[CropPad(32), Resize(224, ResizeMethod.Squish)]
)

I want to get the datasets (train_X, train_y, valid_X, valid_y) that dls has split into training and validation. I didn’t find a related method or property in the docs.

There is dls.items for the dataframe.

There are also dls.valid_ds for the validation set and dls.train_ds for the training set.

There may be another way to filter dls.items that I've forgotten, but you should be able to filter within dls.items too.

Edit: if the dataframe has a boolean column that marks the validation rows (for example, one passed as valid_col), you can filter it for the training set like this:

dls.items[dls.items["column name for split in train or valid "] == False]


Thank you. In addition, it seems that both train_ds and valid_ds have a property named items which in turn is a dataframe.
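
Putting the replies together, here is a minimal sketch of pulling the four pieces out. It assumes, as reported above, that dls.train_ds.items and dls.valid_ds.items are pandas dataframes with the same column layout as train_df. The helper name split_xy and the fake_dls stand-in are hypothetical and only there so the snippet runs without the competition data; with a real DataLoaders you would pass dls itself.

```python
from types import SimpleNamespace

import pandas as pd


def split_xy(dls, fn_col=0, label_col=1):
    """Return (train_X, train_y, valid_X, valid_y) as pandas Series.

    Assumes dls.train_ds.items and dls.valid_ds.items are DataFrames.
    fn_col and label_col are positional column indices, matching the
    values passed to ImageDataLoaders.from_df in the question.
    """
    train_items = dls.train_ds.items
    valid_items = dls.valid_ds.items
    return (
        train_items.iloc[:, fn_col],
        train_items.iloc[:, label_col],
        valid_items.iloc[:, fn_col],
        valid_items.iloc[:, label_col],
    )


# Hypothetical stand-in for a real DataLoaders object, for illustration only.
fake_dls = SimpleNamespace(
    train_ds=SimpleNamespace(
        items=pd.DataFrame({"id": ["a", "b", "c"], "label": [0, 1, 0]})
    ),
    valid_ds=SimpleNamespace(
        items=pd.DataFrame({"id": ["d"], "label": [1]})
    ),
)

train_X, train_y, valid_X, valid_y = split_xy(fake_dls)
```

Note that this returns the file names and raw labels from the dataframe, not decoded images. If you need the actual image and encoded label for a row, indexing the dataset (e.g. dls.train_ds[0]) should give you that pair instead.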
