I can’t get my data retrieval procedure to work.
My main folder (PATH) contains ‘train’, ‘valid’ and ‘test’ folders. The labels are in a .csv file (‘labels.csv’) in the same main folder. The .csv file has two columns: the filenames without their extension (‘.tif’) and the labels.
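To make the layout concrete, here is a minimal plain-Python sketch (no fastai involved) of the pairing I am after: each row of ‘labels.csv’ is matched to a ‘.tif’ file in the ‘train’ folder. The function name `pair_files_with_labels` is hypothetical, and the sketch assumes the .csv has a header row with the filename in the first column and the label in the second.

```python
import csv
from pathlib import Path


def pair_files_with_labels(root, folder="train", csv_name="labels.csv", suffix=".tif"):
    """Return (image_path, label) pairs for CSV rows whose file exists in root/folder."""
    root = Path(root)
    pairs = []
    with open(root / csv_name, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # assumption: the first row is a header
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed rows
            stem, label = row[0], row[1]
            # The CSV stores names without extension, so append the suffix here.
            image_path = root / folder / (stem + suffix)
            if image_path.exists():
                pairs.append((image_path, label))
    return pairs
```

Calling `pair_files_with_labels(PATH)` should then give one (path, label) pair per labelled training image, which is roughly what I expect the fastai pipeline to build internally.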
I chained the methods as follows:
data = (ImageItemList.from_folder(PATH, extensions='.tif')
        .use_partial_data(sample_pct=0.1, seed=34)
        .label_from_df(pd.read_csv('labels.csv'))
        .random_split_by_pct(valid_pct=0.2, seed=34)
        .transform(tfms, size=96)
        .databunch(bs=64)).normalize(imagenet_stats)
In an earlier example on the forum, label_from_csv was mentioned, but I can’t find this method in the docs.
Essentially I’m trying to translate the following (which works):
data = ImageDataBunch.from_csv(PATH, folder='train', test='test', csv_labels='labels.csv', suffix='.tif', valid_pct=0.2, ds_tfms=tfms, size=sz, bs=bs).normalize(imagenet_stats)
In addition, I want to include use_partial_data.
How do I retrieve the data and label the files, using folders and a .csv file?
Any ideas/examples? Thanks