I am building an image classifier that trains on data from two different sources. One set of images is listed in a dataframe, with the labels in a CSV file; the other set sits in labeled folders in my Google Drive. Because the two sets are stored differently, I use a separate DataBlock to build dataloaders for each.
I’ve been trying to figure out how to get a single working dataloader for the two. They both have the exact same item and batch transforms, and yet I am getting errors. I’m not sure what I am missing or what the best solution might be. Everything I’ve seen on the topic focuses on multi-modal problems, which are at a different level of complexity.
Here’s an excerpt of my current (failing) attempt:
dblock = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
                   splitter=splitter,
                   get_x=get_x,
                   get_y=get_y,
                   item_tfms=Resize(448, ResizeMethod.Pad, pad_mode='zeros'),
                   batch_tfms=aug_transforms(size=224, min_scale=0.75))
dls1 = dblock.dataloaders(df)
dblock2 = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
                    get_items=get_image_files,
                    splitter=RandomSplitter(seed=42),
                    get_y=Pipeline([using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'), lambda label: [label]]),
                    item_tfms=Resize(448, ResizeMethod.Pad, pad_mode='zeros'),
                    batch_tfms=aug_transforms(size=224, min_scale=0.75))
dls2 = dblock2.dataloaders(path2/"images")
dls = DataLoaders(dls1, dls2)
learn = cnn_learner(dls, resnet50, metrics=partial(accuracy_multi, thresh=0.2))
learn.fine_tune(3, base_lr=3e-3, freeze_epochs=4)
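One direction I have been considering instead of wrapping the two dataloaders together is to merge both sources into a single list of (image path, labels) records before any DataBlock is involved. The sketch below uses only the standard library and is untested against my real data. The column names `fname` and `labels`, the space delimiter, and the helper names are all assumptions for illustration, and the file-name pattern is the one from my RegexLabeller with the dot escaped.

```python
import csv
import re
from pathlib import Path

# Hypothetical column names and delimiter; adjust to the real CSV layout.
FNAME_COL, LABEL_COL, LABEL_DELIM = "fname", "labels", " "
# Same pattern as the RegexLabeller in the excerpt, with the dot escaped.
NAME_PATTERN = re.compile(r"(.+)_\d+\.jpg$")


def records_from_csv(csv_path, image_root):
    """Return (image_path, [labels]) pairs for the images listed in a CSV file."""
    with open(csv_path, newline="") as f:
        return [
            (Path(image_root) / row[FNAME_COL], row[LABEL_COL].split(LABEL_DELIM))
            for row in csv.DictReader(f)
        ]


def records_from_folder(folder):
    """Return (image_path, [label]) pairs parsed from .jpg file names in a folder."""
    records = []
    for path in sorted(Path(folder).rglob("*.jpg")):
        match = NAME_PATTERN.match(path.name)
        if match:  # skip files whose names do not fit the pattern
            records.append((path, [match.group(1)]))
    return records


def combine_records(csv_path, image_root, folder):
    """Concatenate both sources into one list with a uniform record shape."""
    return records_from_csv(csv_path, image_root) + records_from_folder(folder)
```

If this works, my guess is that the combined list could be handed to one DataBlock whose `get_x` returns the first element of each record and whose `get_y` returns the second, so that both sources share a single label vocabulary and a single train/validation split. I have not verified that part.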
Is there a thread or tutorial on this that I keep missing?