Merging two datasets

I have two very similar datasets, say they are ImageSet1 and ImageSet2.

ImageSet1 has its labels in a CSV, while for ImageSet2 I’m extracting the labels with a regex.

I want to follow something like the approach in this article, which explains how to take two datasets and combine them into train, (possibly) bridge, and validation sets. I’m not sure what the best approach is for doing this.

My goal is to combine both datasets so that I build a better model/learner for my test set from ImageSet1.

I would probably suggest combining the datasets into a common layout in your file system BEFORE you train. For example, could you create a CSV for the other image set?
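
As a rough sketch of that idea (not tested against your data): if the ImageSet2 labels can be pulled out of the file names with a regex, you could write them into a CSV with the same columns as the ImageSet1 one. The `<label>_<number>.jpg` naming scheme, the `name`/`label` column names, and the helper functions below are all made up for illustration, so swap in your own pattern and columns.

```python
import csv
import io
import re

# Hypothetical naming scheme "<label>_<number>.jpg"; replace with your own regex.
LABEL_PATTERN = re.compile(r"^(?P<label>[A-Za-z]+)_\d+\.jpg$")


def labels_from_filenames(filenames):
    """Return (filename, label) pairs for every name the pattern matches."""
    rows = []
    for name in filenames:
        match = LABEL_PATTERN.match(name)
        if match:  # silently skip files that do not follow the naming scheme
            rows.append((name, match.group("label")))
    return rows


def write_label_csv(rows, handle):
    """Write a header plus one (name, label) row per image to an open file handle."""
    writer = csv.writer(handle)
    writer.writerow(["name", "label"])
    writer.writerows(rows)


# Made-up file names standing in for a directory listing of ImageSet2.
example_files = ["cat_001.jpg", "dog_002.jpg", "notes.txt"]
rows = labels_from_filenames(example_files)

# Written to an in-memory buffer here; open a real file with newline="" instead.
buffer = io.StringIO()
write_label_csv(rows, buffer)
csv_text = buffer.getvalue()
```

Once both sets have a CSV in the same shape, they can be loaded and handled the same way.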

Is it possible for fastai to train with two sources (like via the Dataloader)? I’m not sure…
Good luck, report back with your findings!

I definitely could. I guess what I’m wondering is whether there should be a column noting ImageSet1 vs. ImageSet2, mostly because I’m ultimately going to be testing this model against ImageSet1.

One way is to have a dataframe with all the image paths, plus a validation column (0/1) to pick which images to validate with.
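
Something along these lines might work as a starting point, assuming pandas and two label tables with matching columns. The rows, the `name`/`label`/`source`/`is_valid` column names, and the 50% validation fraction are placeholders I picked for the example, not anything prescribed. It also adds the source column, and it draws the validation images only from ImageSet1, since that is the set the model will ultimately be tested against.

```python
import pandas as pd

# Made-up rows standing in for the two label tables (e.g. loaded with pd.read_csv).
set1 = pd.DataFrame({
    "name": ["ImageSet1/a.jpg", "ImageSet1/b.jpg", "ImageSet1/c.jpg", "ImageSet1/d.jpg"],
    "label": ["cat", "dog", "cat", "dog"],
})
set2 = pd.DataFrame({
    "name": ["ImageSet2/e.jpg", "ImageSet2/f.jpg"],
    "label": ["cat", "dog"],
})

# Record where each image came from before stacking the two tables.
set1["source"] = "ImageSet1"
set2["source"] = "ImageSet2"
combined = pd.concat([set1, set2], ignore_index=True)

# Pick a random half of the ImageSet1 rows for validation (fraction is arbitrary).
valid_idx = (
    combined[combined["source"] == "ImageSet1"]
    .sample(frac=0.5, random_state=42)
    .index
)

# 1 = validate on this image, 0 = train on it; all of ImageSet2 stays in training.
combined["is_valid"] = 0
combined.loc[valid_idx, "is_valid"] = 1
```

As far as I know, recent fastai versions can split on a column like this directly (for example the `valid_col` argument of `ImageDataLoaders.from_df`), but check the docs for the version you have installed.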