I have a dataset in which each datapoint pairs an image with a text description of the same object, and I want to classify that object. In my specific case, each datapoint is a picture of a house plus a paragraph describing that house, and I want to predict whether the house needs renovation (i.e. binary classification).
Does anyone have code they could share that shows how to combine an image learner and a text learner into a single classifier?
I’ve done something like this in Keras before, but it would be great to utilise fastai’s transfer learning capabilities individually for the image and text components.
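To make the question concrete, here is a rough sketch of the kind of architecture in question, written in plain PyTorch rather than fastai. It assumes late fusion by concatenation: each encoder produces a fixed-size feature vector, the two vectors are concatenated, and a small head outputs one logit for "needs renovation". The class names, dimensions, and the two tiny encoders are hypothetical stand-ins; in practice they would be replaced by the bodies of pretrained image and text models.

```python
import torch
import torch.nn as nn


class ConcatFusionClassifier(nn.Module):
    """Hypothetical late-fusion model: concatenate image and text features."""

    def __init__(self, image_encoder, text_encoder, image_dim, text_dim, hidden_dim=128):
        super().__init__()
        self.image_encoder = image_encoder  # maps images -> (batch, image_dim)
        self.text_encoder = text_encoder    # maps token ids -> (batch, text_dim)
        self.head = nn.Sequential(
            nn.Linear(image_dim + text_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden_dim, 1),       # single logit for binary classification
        )

    def forward(self, image, tokens):
        img_feat = self.image_encoder(image)
        txt_feat = self.text_encoder(tokens)
        fused = torch.cat([img_feat, txt_feat], dim=1)
        return self.head(fused).squeeze(1)  # shape: (batch,)


class MeanEmbeddingEncoder(nn.Module):
    """Stand-in text encoder (assumption): average of token embeddings."""

    def __init__(self, vocab_size, emb_dim):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)

    def forward(self, tokens):
        return self.emb(tokens).mean(dim=1)


# Stand-in image encoder (assumption): one conv layer plus global average pooling.
image_encoder = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),                           # -> (batch, 8)
)
text_encoder = MeanEmbeddingEncoder(vocab_size=1000, emb_dim=16)
model = ConcatFusionClassifier(image_encoder, text_encoder, image_dim=8, text_dim=16)

# Dummy batch: 4 RGB images and 4 token sequences of length 20.
images = torch.randn(4, 3, 64, 64)
tokens = torch.randint(0, 1000, (4, 20))
labels = torch.tensor([0.0, 1.0, 1.0, 0.0])

logits = model(images, tokens)
loss = nn.BCEWithLogitsLoss()(logits, labels)
loss.backward()
```

The part I am unsure about is how to get the same structure while keeping fastai's pretrained encoders and its fine-tuning workflow for each branch.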