Feeding Data into Data Block with Multiple Image Inputs

My ultimate goal is for learn.predict to accept a list with 2 image paths and return a categorical prediction.

Currently, I have this:

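from fastai.vision.all import *

# get_image and get_image_2 are custom getters defined elsewhere (not shown);
# each takes a DataFrame row.
getters = [
    get_image,
    get_image_2,
    ColReader('class'),
]

# Two image inputs (n_inp=2) and one categorical target.
# indxs holds the validation row indices and is defined elsewhere.
dblock = DataBlock(blocks=(ImageBlock, ImageBlock, CategoryBlock),
                   getters=getters,
                   splitter=IndexSplitter(indxs),
                   item_tfms=[Resize(224)], n_inp=2)

dls = dblock.dataloaders(df, bs=24)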

This trains just fine. My get_image_2 method takes the dataframe row passed in and picks a second image from the dataframe (using conditional logic combined with random choice). The problem is that the predict method needs a dataframe row as input, whereas I would like it to take two image file paths to make a prediction. I am aware of all the siamese examples, but I don’t feel that much customization should be necessary for this use case. I would also prefer not to pre-generate the pairs for training, since picking the second image on the fly produces a larger variety of pair combinations.

Why can’t you just create a simple DataFrame with a single row, with the two paths, and then feed that into the predict function?

I can, and that’s what I’m going to do if I can’t find a better solution. However, I would prefer not to create all the pairs up front in the dataframe to train. Being able to choose the second image somewhat randomly as training proceeds would be better. I don’t see how you separate the choosing of images at training time versus how to choose them at prediction time here.
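
One way to separate the two cases might be to let the second getter check whether the row already names its partner. Below is a minimal sketch of that idea, not tested against fastai itself. The column names `image` and `image_2`, the helper `pick_partner`, and the factory `make_get_image_2` are all hypothetical, and `pick_partner` is only a stand-in for the real conditional logic. It assumes the row supports `.get` and `[]` the way a pandas Series or a dict does.

```python
import random
from pathlib import Path

# Hypothetical column names; the thread does not show the real ones.
FIRST_COL = "image"
SECOND_COL = "image_2"

def pick_partner(first_path, pool, rng=random):
    """Stand-in for the conditional + random pairing used during training."""
    candidates = [p for p in pool if p != first_path]
    return rng.choice(candidates)

def make_get_image_2(pool):
    """Build a getter that prefers an explicit second path when the row has one."""
    def get_image_2(row):
        explicit = row.get(SECOND_COL)  # None when the column is absent
        if explicit is not None:
            # Prediction time: the caller supplied both paths in the row.
            return Path(explicit)
        # Training time: choose the partner on the fly.
        return Path(pick_partner(row[FIRST_COL], pool))
    return get_image_2
```

With something like this, the training dataframe would simply not have the `image_2` column, and at prediction time a one-row input such as `pd.Series({'image': path_a, 'image_2': path_b})` could be handed to learn.predict, so no pairs need to be generated up front.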

Above you indicate your problem is with the predict method.

If the problem is with training, why use a DataFrame at all if you want to work with file paths? That would solve your problem of having to define the pairs up front.

Outside of that, you may consider using two DataFrames with columns: image_fpath, label … and feed them both in as a tuple to your get_items. The first DataFrame represents your primary training data and the second is merely used for pairing it with another image. You can then control the pairing in your get_x method or else use a Transform to do that.

Your suggestions sound intriguing… I don’t see how I could do it the way you suggest with the file paths though. How would I dynamically grab another image based on some conditions? Luckily the conditions are derived from the path of the first image like the pets dataset. I do believe this should work as I stated in the question, I just can’t seem to make it work. This is basically the siamese problem I’m trying to solve without creating a lot of custom classes like the siamese tutorials do.
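
For the path-only route, a rough sketch of grabbing a second image dynamically could look like the following. It assumes pets-style file names of the form `<label>_<number>.jpg`; the regex and the helper names `label_of`, `index_by_label`, and `pick_second` are made up for illustration, and the same/different flag is just one example of a pairing condition.

```python
import random
import re
from collections import defaultdict
from pathlib import Path

# Assumed pets-style naming: "<label>_<number>.jpg"
LABEL_RE = re.compile(r"^(.+)_\d+\.jpg$")

def label_of(path):
    """Parse the label out of the file name."""
    match = LABEL_RE.match(Path(path).name)
    if match is None:
        raise ValueError(f"cannot parse a label from {path}")
    return match.group(1)

def index_by_label(paths):
    """Group all file paths by their parsed label, once, up front."""
    groups = defaultdict(list)
    for p in paths:
        groups[label_of(p)].append(Path(p))
    return groups

def pick_second(first, groups, same, rng=random):
    """Randomly pick a partner with the same (or a different) label."""
    first = Path(first)
    label = label_of(first)
    if same:
        candidates = [p for p in groups.get(label, []) if p != first]
    else:
        candidates = [p for lbl, ps in groups.items() if lbl != label for p in ps]
    if not candidates:
        raise ValueError(f"no partner available for {first}")
    return rng.choice(candidates)
```

A getter for the second image could then call `pick_second` on the first path each time an item is fetched, which keeps the pairing random across epochs without building the pairs in advance.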

Hi Bob, were you able to solve your problem? I am working on a similar use case with 2 images as input. Can you share your data loading code if possible?
