Combining text and image into one model

Hello,

I’m building a classifier for a large number of classes (about 500). For each item in my dataset I have text (such as an opinion about the item) and an image of the item. I want to combine these two features to get better results. Is it possible to do this with the fastai library? Or should I train two separate models and combine their outputs afterwards?
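To make the second option concrete, here is a rough sketch of what combining the outputs could look like. It uses plain NumPy rather than any fastai API, and it assumes each model produces one row of raw scores (logits) per item over the same 500 classes. The names `late_fusion` and `text_weight` are made up for illustration, and the random arrays stand in for real model outputs.

```python
import numpy as np

def softmax(logits):
    """Convert each row of logits into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def late_fusion(text_logits, image_logits, text_weight=0.5):
    """Weighted average of the two models' class probabilities.

    text_weight is a hypothetical knob; it would need tuning on a validation set.
    Returns the predicted class index per item and the fused probabilities.
    """
    text_probs = softmax(text_logits)
    image_probs = softmax(image_logits)
    fused = text_weight * text_probs + (1.0 - text_weight) * image_probs
    return fused.argmax(axis=1), fused

# Placeholder outputs for 4 items and 500 classes (random, not real predictions).
rng = np.random.default_rng(0)
text_logits = rng.normal(size=(4, 500))
image_logits = rng.normal(size=(4, 500))
pred, probs = late_fusion(text_logits, image_logits)
```

Averaging probabilities is only one way to combine outputs. Another would be to concatenate the features from both models and train a small classifier on top of them.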

Best regards

I want to combine these two features to get better results

This is exactly what I’m looking to do as well. I would like to add some context to the images; in my case, the time of day and the ambient temperature. I think a lot of what helps human perception is combining the context of the situation with what the eyes are seeing.

One way I can think of to do this would be to preprocess the image data so that each image is padded with some sort of color scale that changes depending on temperature, time, opinion, etc.

I haven’t tried it, but I think it should be possible.
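As a minimal, untested-in-training sketch of the padding idea: the function below surrounds an image with a solid border whose color encodes the context. Everything here is an assumption for illustration: the name `pad_with_context`, the choice of red for temperature and green for hour of day, the border width, and the temperature range of -20 to 40 °C.

```python
import numpy as np

def pad_with_context(image, temperature_c, hour, border=8,
                     temp_range=(-20.0, 40.0)):
    """Surround an HxWx3 uint8 image with a border that encodes context.

    Red channel:   temperature, scaled linearly over temp_range (assumed range).
    Green channel: hour of day, scaled linearly over 0..24.
    Blue channel:  unused, left at 0.
    """
    lo, hi = temp_range
    t = min(max((temperature_c - lo) / (hi - lo), 0.0), 1.0)  # clamp to 0..1
    h = (hour % 24) / 24.0
    color = np.array([int(t * 255), int(h * 255), 0], dtype=np.uint8)

    height, width, _ = image.shape
    padded = np.empty((height + 2 * border, width + 2 * border, 3), dtype=np.uint8)
    padded[:] = color                                          # fill with context color
    padded[border:border + height, border:border + width] = image  # paste original
    return padded

# Example: a flat gray 32x32 image taken at noon at 10 degrees C.
img = np.full((32, 32, 3), 100, dtype=np.uint8)
out = pad_with_context(img, temperature_c=10.0, hour=12)
```

Two things to watch for with this approach: a linear scale puts 23:00 and 00:00 at opposite ends even though they are close in time, and augmentations such as random crops could cut the border off.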