Combining text and image into one model

Hello,

I’m building a classifier with a lot of classes (about 500), and in my dataset I’ve got text (e.g. an opinion) and an image for each item. I want to combine these two features to get better results. Is it possible to do this with the fastai library? Or should I build two separate models and then combine their outputs in some way?
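Roughly, something like this is what I have in mind: two encoders whose outputs are concatenated before a classification head. This is just a plain-PyTorch sketch; the class count, text feature size and the placeholder text encoder are made up for illustration.

```python
import torch
import torch.nn as nn
from torchvision import models

class ImageTextClassifier(nn.Module):
    def __init__(self, n_classes=500, text_feat_dim=300):
        super().__init__()
        # Image branch: a pretrained CNN with its classification head removed.
        cnn = models.resnet34(pretrained=True)
        self.image_encoder = nn.Sequential(*list(cnn.children())[:-1])  # -> (B, 512, 1, 1)
        # Text branch: a placeholder MLP over precomputed text features
        # (e.g. averaged embeddings); a proper text encoder could go here instead.
        self.text_encoder = nn.Sequential(
            nn.Linear(text_feat_dim, 256), nn.ReLU(), nn.Dropout(0.25)
        )
        # Head: classify from the concatenated image + text features.
        self.head = nn.Sequential(
            nn.Linear(512 + 256, 512), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(512, n_classes),
        )

    def forward(self, image, text_feats):
        img = self.image_encoder(image).flatten(1)   # (B, 512)
        txt = self.text_encoder(text_feats)          # (B, 256)
        return self.head(torch.cat([img, txt], dim=1))
```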

Best regards

I want to combine these two features to get better results

This is exactly what I’m looking to do as well. I would like to be able to add some context to the images; in my case, time of day and ambient temperature. A lot of what helps human perception, I think, is combining the context of the situation with what the eyes are seeing.

One way I could think of doing this would be to preprocess the image data, padding it with some sort of color scale that changes depending on temperature, time, opinion, etc.

I haven’t tried it, but it should be possible, I think.
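Roughly, I’m imagining something like this. The temperature range, the blue-to-red mapping and the band height are arbitrary choices, just for illustration:

```python
import numpy as np
from PIL import Image

def add_temperature_band(img: Image.Image, temp_c: float,
                         t_min=-20.0, t_max=40.0, band_px=16) -> Image.Image:
    # Normalise temperature to [0, 1] and map it to a blue -> red color.
    t = float(np.clip((temp_c - t_min) / (t_max - t_min), 0.0, 1.0))
    color = (int(255 * t), 0, int(255 * (1 - t)))
    band = Image.new("RGB", (img.width, band_px), color)
    # Paste the colored band below the original image.
    out = Image.new("RGB", (img.width, img.height + band_px))
    out.paste(img.convert("RGB"), (0, 0))
    out.paste(band, (0, img.height))
    return out

# e.g. add_temperature_band(Image.open("item.jpg"), temp_c=27.5)
```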

In the Rossmann example, calendar (weekend or holiday) and weather info are joined to the Rossmann data before training the model. You might look at that. Turning your metadata into a color code sounds more complicated to me than just attaching it to your data.
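Roughly, that joining step is just a table merge before training. This is not the actual Rossmann notebook code; the column names below are made up:

```python
import pandas as pd

items   = pd.DataFrame({"item_id": [1, 2], "label": ["a", "b"],
                        "date": pd.to_datetime(["2019-01-05", "2019-01-07"])})
weather = pd.DataFrame({"date": pd.to_datetime(["2019-01-05", "2019-01-07"]),
                        "temp_c": [3.0, 12.5], "is_holiday": [True, False]})

# Attach the context columns to the main table, then add simple calendar features.
df = items.merge(weather, on="date", how="left")
df["dayofweek"] = df["date"].dt.dayofweek
print(df.head())
```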

You probably have better programming skills than I do. Mine are not stellar. I am most effective at taking a working piece of software apart, substituting my data, and fiddling around until it stops breaking.