arka
(Arka Nayan)
1
The code here in lesson2-image_models.ipynb seems to use the whole dataset by default.
def get_data(sz):
    tfms = tfms_from_model(f_model, sz, aug_tfms=transforms_top_down, max_zoom=1.05)
    return ImageClassifierData.from_csv('', trainc_dir, label_csv, tfms=tfms, val_idxs=val_idxs, test_name=testc_dir)
data = get_data(sz)
learn = ConvLearner.pretrained(f_model, data, metrics=metrics)
How can I use only a subset of the data? I can't find anything about that.
GeoH
(Geoffrey)
2
Look at my example:
np.random.seed(42)
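# use_partial_data is an ItemList method, so the chain starts from ImageList
# rather than ImageDataBunch (assumes path is a Path and path_images a subfolder of it)
data = (ImageList.from_folder(path/path_images)
        .use_partial_data(0.1)       # keep a random 10% of the items
        .split_by_rand_pct(0.2)      # hold out 20% of those for validation
        .label_from_folder()
        .transform(tfms, size=128)
        .databunch(bs=64)
        .normalize(imagenet_stats))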
I call .use_partial_data(0.1) to keep only 10% of my dataset.
See also:
https://docs.fast.ai/data_block.html#ItemList.use_partial_data
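
The example above uses the fastai v1 data block API, while the code in the question (ImageClassifierData.from_csv, ConvLearner.pretrained) is from the older 0.7 library. As far as I know that version has no use_partial_data, so one possible workaround is to write a smaller label CSV and point from_csv at it. This is only a rough sketch using the standard library: the function name write_label_subset and its parameters are made up for illustration, and it assumes the label file has one header row followed by one row per image.

```python
import csv
import random

def write_label_subset(label_csv, subset_csv, frac=0.1, seed=42):
    """Copy the header plus a random `frac` of the data rows to `subset_csv`.

    Hypothetical helper, not part of fastai. Returns the number of rows kept.
    """
    with open(label_csv, newline="") as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]

    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    # Keep at least one row, but never more than the file contains
    n_keep = min(len(body), max(1, int(len(body) * frac)))
    kept = rng.sample(body, n_keep)

    with open(subset_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(kept)
    return n_keep
```

You would then pass the new file as label_csv in get_data. Since val_idxs index into the rows of the label file, they would need to be recomputed for the smaller file, e.g. val_idxs = get_cv_idxs(n_keep) if you follow the notebook.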