For a project, I want to apply a filter to the image data.
The filter should select only the images that have certain labels in the accompanying .csv file.
How can I achieve this?
Before filtering, I retrieved the data like this:
data = (ImageList.from_csv(PATH, folder='train', csv_name='labels.csv', cols='image_id')
        .use_partial_data(sample_pct=.1, seed=31)
        .random_split_by_pct(valid_pct=0.25, seed=29)
        .label_from_df(cols='dx')
        .transform(tfms, size=224)
        .databunch(bs=64)).normalize(imagenet_stats)
My current idea is to list all the needed files with Pandas, then check for each item in data.items
whether it is present in that list. I suspect there is an easier way using either filter_by_func
or label_cls. Any tips?
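To make the idea concrete, here is a rough sketch of what I have in mind. It uses the standard csv module instead of Pandas so that it runs on its own. The sample rows, the labels in WANTED_LABELS, and the helper names wanted_ids and keep_item are made up for illustration. Only the column names image_id and dx come from my real file.

```python
import csv
import io
from pathlib import Path

# Hypothetical stand-in for labels.csv; the real file would be opened from PATH instead.
SAMPLE_CSV = """image_id,dx
img_001,nv
img_002,mel
img_003,bkl
img_004,mel
"""

# Hypothetical set of labels whose images should be kept.
WANTED_LABELS = {"mel", "bkl"}


def wanted_ids(csv_file, labels, id_col="image_id", label_col="dx"):
    """Return the set of image ids whose label is one of `labels`."""
    reader = csv.DictReader(csv_file)
    return {row[id_col] for row in reader if row[label_col] in labels}


keep_ids = wanted_ids(io.StringIO(SAMPLE_CSV), WANTED_LABELS)


def keep_item(item):
    """Predicate: True if the item's file name (without extension) is a wanted id."""
    return Path(str(item)).stem in keep_ids


# Stand-in for data.items: a list of image paths (hypothetical names).
items = ["train/img_001.jpg", "train/img_002.jpg",
         "train/img_003.jpg", "train/img_004.jpg"]
filtered = [p for p in items if keep_item(p)]
```

If filter_by_func takes a function that returns True for the items to keep (I am assuming so from the name), then keep_item could presumably be passed to it right after from_csv, which would avoid looping over data.items by hand. I have not verified this.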