Hello all!
I am trying to build a model that mixes text and image inputs. I like the Datasets + DataLoaders approach because of the flexibility it gives for designing the inputs. However, I am a bit confused about how before_batch, after_item and after_batch really work.
My data is as follows:
items is a list of strings, each with the following structure:
"<label>;<text>;<image filename>;<is_valid>"
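For concreteness, a made-up item (the values below are just illustrative) splits like this:

```python
# Made-up example following "<label>;<text>;<image filename>;<is_valid>"
item = "cat;a fluffy cat;cat_001.jpg;valid"

label, text, fname, split = item.split(';')
print(label, fname, split)  # cat cat_001.jpg valid
```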
Dataset and Dataloaders:
def get_item(item, index):
    return item.split(';')[index]
splits=FuncSplitter(lambda o: o.split(";")[3]=='valid')(items)
dsrc = Datasets(items,
splits=splits,
tfms=[[partial(get_item, 2), PILImage.create], #image axis
[partial(get_item, 1), partial(custom_tokenizer)], #text axis
[partial(get_item, 0), Categorize()] #label
])
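As a sanity check, here is a plain-Python sketch of what I expect the FuncSplitter call to do with these items (the example items are made up), i.e. return train and valid index lists based on the fourth field:

```python
# Plain-Python sketch of the split logic, not the actual fastai FuncSplitter
items = [
    "cat;a fluffy cat;cat_001.jpg;train",
    "dog;a small dog;dog_001.jpg;valid",
    "cat;another cat;cat_002.jpg;train",
]

def is_valid(o):
    return o.split(";")[3] == "valid"

train_idx = [i for i, o in enumerate(items) if not is_valid(o)]
valid_idx = [i for i, o in enumerate(items) if is_valid(o)]
print(train_idx, valid_idx)  # [0, 2] [1]
```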
dls = dsrc.dataloaders(bs=8, source=items, #num_workers=8,
after_item = [Resize(528, method='squish'), ToTensor()],
before_batch=[partial(pad_input,pad_fields=1)],
after_batch= [IntToFloatTensor(),
*aug_transforms(size=528,
do_flip=True,
max_rotate=15,
max_zoom=1.1,
max_lighting=0.3,
max_warp=0.0, #0.2,
p_affine=1.0,
p_lighting=1.0),
NormalizeEf.from_advprop(2.0, 1.0)
],
shuffle_train=True, path=path
)
How do before_batch, after_item and after_batch “know” which axis (i.e. which element of the tuple) each transform should be applied to?
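My rough mental model is that each transform declares the type it acts on, and type dispatch only applies it to the matching tuple elements (e.g. Resize fires on the image but passes the tokenized text through). Here is a plain-Python sketch of that idea using functools.singledispatch; this is NOT fastai's internals, just the mechanism as I understand it:

```python
# Plain-Python sketch (not fastai internals): a transform declares the
# type it handles, and dispatch skips tuple elements of any other type.
from functools import singledispatch

class ImageLike:  # stand-in for PILImage / TensorImage
    pass

class TextLike:   # stand-in for tokenized text
    pass

@singledispatch
def resize(x):
    return x  # default: pass other types through unchanged

@resize.register
def _(x: ImageLike):
    return "resized image"

batch = (ImageLike(), TextLike(), "label")
out = tuple(resize(o) for o in batch)
print(out)  # only the ImageLike element is transformed
```

Is that roughly how the pipelines decide what to touch, or am I off?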
@muellerzr, any thoughts?