Hi, FastAI friends,
Lost in the Data Block API again, I turn to you for help. I’m creating a model that generates controls for facial landmark estimation. In practice, I have a neutral face made of 68 points, with no deformation or rotation. Given another set of 68 points (i.e., another face, but this time twisted and rotated), my model returns a set of controls that I use to deform the neutral one. I then compare the difference between the two faces and use this as a loss. At least that’s what I want to do. I’m lost in the data block again, and I promised myself I would write my own pipeline this time, to get a better understanding of the library (even though I know V2 is coming).
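To make the loss idea concrete, here is a minimal NumPy sketch. It assumes, purely for illustration, that the controls are per-point 3D offsets added to the neutral face; the real deformation model is not described in this post, and the names `deform` and `landmark_loss` are hypothetical.

```python
import numpy as np

N_POINTS = 68  # landmarks per face, each an (x, y, z) point


def deform(neutral, controls):
    """Apply the controls to the neutral face.

    Assumption: the controls are per-point offsets of shape (68, 3).
    The actual deformation model may be entirely different.
    """
    return neutral + controls


def landmark_loss(neutral, controls, observed):
    """Mean squared Euclidean distance between deformed and observed landmarks."""
    deformed = deform(neutral, controls)
    return float(np.mean(np.sum((deformed - observed) ** 2, axis=1)))


rng = np.random.default_rng(0)
neutral = rng.normal(size=(N_POINTS, 3))
observed = neutral + 0.1  # stand-in for a twisted and rotated face

# Zero controls leave the neutral face untouched, so the loss is non-zero.
loss_before = landmark_loss(neutral, np.zeros((N_POINTS, 3)), observed)
# Controls that exactly match the offset bring the loss down to (about) zero.
loss_after = landmark_loss(neutral, observed - neutral, observed)
```

For actual training, the same computation would have to be written with torch tensors so that gradients can flow back through the deformation into the model.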
Anyway, here’s what I’m doing:
Taking a top-down approach, I first created the DataBunch class:
class FaceDataBunch(DataBunch):
    @classmethod
    def create(cls, train_ds, valid_ds, test_ds=None, path:PathOrStr='.', no_check:bool=False, bs=64, val_bs:int=None,
               num_workers:int=0, device:torch.device=None, collate_fn:Callable=data_collate,
               dl_tfms:Optional[Collection[Callable]]=None, bptt:int=70,
               preloader_cls=None, shuffle_dl=False, transpose_range=(0,12), **kwargs) -> DataBunch:
        datasets = cls._init_ds(train_ds, valid_ds, test_ds)
        preloader_cls = MusicPreloader if preloader_cls is None else preloader_cls
        val_bs = ifnone(val_bs, bs)
        datasets = [preloader_cls(ds, shuffle=(i==0), bs=(bs if i==0 else val_bs), bptt=bptt, transpose_range=transpose_range, **kwargs)
                    for i,ds in enumerate(datasets)]
        val_bs = bs
        dl_tfms = [partially_apply_vocab(tfm, train_ds.vocab) for tfm in listify(dl_tfms)]
        dls = [DataLoader(d, b, shuffle=shuffle_dl) for d,b in zip(datasets, (bs,val_bs,val_bs,val_bs)) if d is not None]
        return cls(*dls, path=path, device=device, dl_tfms=dl_tfms, collate_fn=collate_fn, no_check=no_check)

    @classmethod
    def from_folder(cls, path:PathOrStr, extensions='.npy', **kwargs):
        files = get_files(path, extensions=extensions, recurse=True)
        return cls.from_files(files, path, **kwargs)

    @classmethod
    def from_files(cls, files, path, processors=None, split_pct=0.1,
                   vocab=None, list_cls=None, **kwargs):
        list_cls = FaceDataList
        src = (list_cls(items = files, path = path, processor = None)
               .split_by_rand_pct(split_pct, seed=6))
        src = src.HOW_TO_LABEL() ???
        return src.databunch(**kwargs)
So, to function, this needs the FaceDataList:
class FaceDataList(ItemList):
    _bunch, _processor, _label_cls = FaceDataBunch, FaceDataProcessor, FloatList

    def get(self, i):
        filename = super().get(i)
        obj = self.open(filename)
        return obj

    def open(self, fn):
        return np.load(fn)
Finally, I created a face point item, FaceData (which, I discovered later, is almost a clone of FloatItem):
class FaceData(ItemBase):
    def __init__(self, points):
        self.points = points
        self.data = torch.tensor(points.reshape(-1)).float()

    def __str__(self):
        return '{}'.format(np.copy(self.points).reshape(-1,3))

    def to_one(self):
        return np.copy(self.points).reshape(-1,3)
So far, this seems to correctly return the src variable in the databunch creation. However, I have no idea how to label this. The labels are the initially loaded points themselves, much like in an autoencoder, but how can I give this to the databunch? I wanted to create an auto_label() function in the FaceDataList class, but I can’t figure out the inputs or the outputs needed.
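Stripped of the fastai machinery, the behaviour being asked for is a dataset whose target is simply its own input. Here is a minimal sketch in plain NumPy; the class name `SelfLabelledFaces` is hypothetical and not part of fastai, and the 68 x 3 shape is assumed from the `reshape(-1,3)` calls above.

```python
import os
import tempfile

import numpy as np


class SelfLabelledFaces:
    """Hypothetical dataset: each .npy file is both the input and the label."""

    def __init__(self, files):
        self.files = list(files)

    def __len__(self):
        return len(self.files)

    def __getitem__(self, i):
        # Flatten to a 1D float vector, mirroring FaceData.data.
        points = np.load(self.files[i]).astype(np.float32).reshape(-1)
        # Self-labelling: the target is a copy of the input.
        return points, points.copy()


# Write two fake faces (68 points, 3 coordinates each) to disk to try it out.
tmp_dir = tempfile.mkdtemp()
files = []
for k in range(2):
    fn = os.path.join(tmp_dir, f"face_{k}.npy")
    np.save(fn, np.full((68, 3), float(k)))
    files.append(fn)

ds = SelfLabelledFaces(files)
x, y = ds[1]  # x and y hold the same 204 values
```

Within fastai v1, the closest analogue is probably `label_from_func` with a `label_cls` that loads the same file again, the way `ImageImageList` pairs an image input with an image target. That is untested for this pipeline, so treat it as an assumption to check against the data block docs.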
Could someone point me in the right direction? Thanks a lot!