@ilovescience
thanks for the pointer! The get_dls from your repo is exactly the reference I was looking for. You use Datasets instead of DataBlock to build the batches, and RandPair(filesB) randomly pairs each image in domain A with an image in domain B.
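If I read the usage right, RandPair ignores the incoming domain-A item and just draws a random file from filesB. A minimal pure-Python sketch of that idea (the class name and structure here are my guesses, not your actual code):

```python
import random

class RandPairSketch:
    "Sketch: map any domain-A item to a randomly chosen domain-B file."
    def __init__(self, items_b):
        self.items_b = list(items_b)

    def __call__(self, item_a):
        # the domain-A item is ignored; the pairing is re-drawn on every call
        return random.choice(self.items_b)
```

In fastai terms this would presumably be a Transform whose encodes does the same thing, so each epoch yields fresh random pairings.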
I also found this Siamese tutorial pretty relevant. So far I have tried building my fastai data from torch.utils.data.Dataset following that tutorial; however, that approach disables show_batch / show_results.
I would like to take a step further and build the same thing in a more fastai way (e.g. with the mid-level API), taking your get_dls as a reference. It seems get_dls currently doesn't have a train/valid partition. I would like a different pairing scheme for the train and valid sets (e.g. in the valid set, an image in A should always pair up with the same image in B).
def get_dls(pathA, pathB, num_A=None, num_B=None, load_size=512, crop_size=256, bs=4, num_workers=2):
    """
    Given image files from two domains (`pathA`, `pathB`), create a `DataLoaders` object.
    `load_size` and `crop_size` set the loading and random-crop sizes (defaults 512 and 256).
    Batch size is specified by `bs` (default 4).
    """
    filesA = get_image_files(pathA)
    filesB = get_image_files(pathB)
    # optionally cap the number of files taken from each domain
    filesA = filesA[:min(ifnone(num_A, len(filesA)), len(filesA))]
    filesB = filesB[:min(ifnone(num_B, len(filesB)), len(filesB))]
    dsets = Datasets(filesA,
                     tfms=[[PILImage.create, ToTensor, Resize(load_size), RandomCrop(crop_size)],
                           [RandPair(filesB), PILImage.create, ToTensor, Resize(load_size), RandomCrop(crop_size)]],
                     splits=None)
    batch_tfms = [IntToFloatTensor, Normalize.from_stats(mean=0.5, std=0.5), FlipItem(p=0.5)]
    dls = dsets.dataloaders(bs=bs, num_workers=num_workers, after_batch=batch_tfms)
    return dls
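To make the split-dependent pairing I have in mind concrete, here is a rough pure-Python sketch (the class name and the `valid_pairs` map are mine, not from fastai or your repo): train items get a fresh random partner from B, while valid items always get the same precomputed partner.

```python
import random

class SplitAwarePair:
    """Sketch of the pairing scheme I want: a random B-partner for train
    items, and a fixed (always the same) B-partner for valid items.
    `valid_pairs` is a hypothetical precomputed {A item: B item} map."""
    def __init__(self, items_b, valid_pairs):
        self.items_b = list(items_b)
        self.valid_pairs = dict(valid_pairs)

    def __call__(self, item_a):
        if item_a in self.valid_pairs:
            # validation item: deterministic pairing
            return self.valid_pairs[item_a]
        # training item: fresh random pairing each call
        return random.choice(self.items_b)
```

With fastai one would presumably also pass a real `splits` argument (e.g. from `RandomSplitter`) to `Datasets` instead of `splits=None`; I'm not sure of the cleanest way to make the transform itself split-aware, which is part of my question.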