Combining two ImageDataBunch

Hello everyone!

I have two DataBunches:

data1 = (src1.transform(tfms, size=128).databunch(bs=64).normalize(imagenet_stats))

data2 = (src2.transform(tfms, size=128).databunch(bs=64).normalize(imagenet_stats))

Is there a way to merge those two ImageDataBunch objects, i.e. combine both datasets in one variable alldata?

I’ll use the combined data to feed the model like this:

learn = cnn_learner(alldata, models.resnet34, pretrained=False, metrics=[f_score])

Is it possible? I need to combine different images from multiple folders, which is why I’m trying to do this.

Thanks in advance!

You should use the methods of the data block API to combine your sources. IIRC there is an add method that merges two ItemLists.
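
As a rough illustration of combining sources before any DataBunch is built, the sketch below gathers image files from several folders into one list using only the standard library. The helper name collect_image_files and the set of extensions are assumptions made for this example, not part of fastai. A combined list like this could then feed a single data block pipeline instead of two separate ones.

```python
from pathlib import Path

# Extensions treated as images (an assumption; adjust to your data).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def collect_image_files(folders):
    """Return a sorted list of image paths found recursively under each folder.

    Hypothetical helper for illustration; it is not a fastai function.
    """
    files = []
    for folder in folders:
        for p in Path(folder).rglob("*"):
            # Keep regular files whose extension looks like an image.
            if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS:
                files.append(p)
    return sorted(files)
```

Calling collect_image_files(["folder_a", "folder_b"]) (folder names are placeholders) returns one flat list of paths, so the split, labelling and transform steps only need to be written once.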

@sgugger

I have three datasets, which I created using the data block API:

  1. 1st Data:

data_wiki = (ImageList.from_df(df_age, path, cols=['full_path'], folder='.')
             .split_by_rand_pct(0.2, seed=42)
             .label_from_df(label_cls=FloatList)
             .transform(tfms, resize_method=ResizeMethod.CROP, padding_mode='border', size=224)
             .databunch(bs=64*4, num_workers=0)
             .normalize(imagenet_stats))

  2. 2nd Data:

data_utk = (ImageList.from_folder(path)
            .split_by_rand_pct(0.2, seed=42)
            .label_from_func(extract_age, label_cls=FloatList)
            .transform(tfms, resize_method=ResizeMethod.CROP, padding_mode='border', size=224)
            .databunch(bs=64*4, num_workers=0)
            .normalize(imagenet_stats))

  3. 3rd Data:

data_appa = (ImageList.from_csv(path, csv_name='/kaggle/input/appa-real-face-cropped/labels.csv', folder='.')
             .split_by_rand_pct(0.2, seed=42)
             .label_from_df(label_cls=FloatList)
             .transform(tfms, resize_method=ResizeMethod.CROP, padding_mode='border', size=224)
             .databunch(bs=64*4, num_workers=0)
             .normalize(imagenet_stats))

The three are created in different ways (one from a CSV, one from a DataFrame and one from a folder). However, the image sizes, batch sizes and tfms are the same.

Can I add these three together and train my model?
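
One possible route, instead of adding the finished DataBunches together, is to merge the three sources into a single table of paths and labels first. The sketch below does that with the standard library only. The function names and the age column name are assumptions for illustration; full_path is taken from the first pipeline above. The resulting CSV could then go through one from_csv or from_df pipeline like the ones already shown.

```python
import csv


def merge_labeled_sources(*sources):
    """Concatenate iterables of (image_path, age) pairs into one list of rows.

    Hypothetical helper: each source may come from a CSV, a DataFrame or a
    folder scan, as long as it yields (path, age) pairs.
    """
    rows = []
    for source in sources:
        for image_path, age in source:
            # Store paths as strings and ages as floats so all rows match.
            rows.append((str(image_path), float(age)))
    return rows


def write_combined_csv(rows, csv_path):
    """Write the merged rows to a CSV with a 'full_path' and an 'age' column."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["full_path", "age"])  # 'age' is an assumed column name
        writer.writerows(rows)
```

With a single table, the random split is also drawn once over all images rather than separately per dataset, which may or may not be what you want.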

I’m also wondering about this. I have two different image datasets that were taken under different conditions, which results in different colorations. I’d like to normalize the two sets separately and then combine them into one set to train on as a whole. Is there a straightforward way to do that?
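
To make the idea concrete, here is a minimal pure-Python sketch of per-source normalization: each set gets its own per-channel mean and standard deviation, and the sets are only concatenated afterwards. The function names and the representation of an image as a list of (r, g, b) tuples are assumptions for illustration; real code would work on tensors. Note that the .normalize(imagenet_stats) call used earlier in this thread applies one set of statistics to the whole DataBunch, so a per-source scheme would have to happen earlier in the pipeline.

```python
from statistics import fmean, pstdev


def channel_stats(images):
    """Per-channel (means, stds) over every pixel of every image in one source.

    Each image is assumed to be a list of (r, g, b) tuples of floats.
    """
    means, stds = [], []
    for c in range(3):
        values = [pixel[c] for image in images for pixel in image]
        means.append(fmean(values))
        stds.append(pstdev(values))
    return means, stds


def normalize_source(images):
    """Normalize one source with its own statistics (not shared across sources)."""
    means, stds = channel_stats(images)
    return [
        [
            # Fall back to 1.0 when a channel is constant, to avoid dividing by zero.
            tuple((px[c] - means[c]) / (stds[c] or 1.0) for c in range(3))
            for px in image
        ]
        for image in images
    ]


# Two tiny made-up sources with different colorations.
set_a = [[(0.2, 0.4, 0.6), (0.4, 0.6, 0.8)]]
set_b = [[(0.7, 0.1, 0.3), (0.9, 0.3, 0.5)]]

# Normalize each source separately, then combine them into one training set.
combined = normalize_source(set_a) + normalize_source(set_b)
```

After this step both sources are centred on zero per channel, so the difference in coloration between them is reduced before training.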

Hi,
Have you found a solution? I have the same problem…

IIRC, look at from_lists in the API.

Excuse me, I cannot find the function you mentioned. Could you point me to it, if you can?