Classification using both image and text


(Edward Atkins) #1

I have a dataset of images and text descriptions that describe a certain object that I want to classify. In my specific case each datapoint is a picture of a house and a paragraph description of that house, and I want to classify whether that house needs renovation (i.e. binary classification).

I was wondering if anyone had any code they could share on how to combine an image learner and a text learner to provide one classification?

I’ve done something like this in Keras before, but it would be great to utilise fastai’s transfer learning capabilities individually for the image and text components.


(Navneet Kumar Chaudhary) #2

Hey Edward,
The task you are describing is called image captioning. You can follow the topic on GitHub and find some good repositories there.
Topic - https://github.com/topics/image-captioning

One good repo that uses PyTorch for image captioning:
https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning
This is usually based on encoder-decoder models, in which an encoder extracts features from the image and a decoder uses those features to generate a text description of the image.
I would love to help and learn with you.


(Edward Atkins) #3

Thanks for your reply @navneetkrch, but I have edited my question to clarify that I want the text and the image to both be inputs to a binary classification problem.
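
To make the setup concrete, here is a minimal PyTorch sketch of the kind of model I have in mind: two encoders whose feature vectors are concatenated and passed to a single classification head. This is only an illustration, not a tested solution. The tiny CNN and the mean-of-embeddings text encoder are placeholders for pretrained image and text models, and all names and sizes (`LateFusionClassifier`, `vocab_size`, `feat_dim`, the dummy batch) are made up for the example.

```python
import torch
import torch.nn as nn


class LateFusionClassifier(nn.Module):
    """Concatenate image and text features, then predict one logit."""

    def __init__(self, vocab_size=1000, feat_dim=32):
        super().__init__()
        # Placeholder image encoder; a pretrained CNN body would go here.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(8, feat_dim),
        )
        # Placeholder text encoder: mean of the token embeddings.
        self.text_encoder = nn.EmbeddingBag(vocab_size, feat_dim, mode="mean")
        # Head over the concatenated features; one logit for the binary label.
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim),
            nn.ReLU(),
            nn.Linear(feat_dim, 1),
        )

    def forward(self, images, token_ids):
        img_feat = self.image_encoder(images)    # (batch, feat_dim)
        txt_feat = self.text_encoder(token_ids)  # (batch, feat_dim)
        fused = torch.cat([img_feat, txt_feat], dim=1)
        return self.head(fused).squeeze(1)       # (batch,) raw logits


model = LateFusionClassifier()
images = torch.randn(4, 3, 64, 64)           # dummy batch of RGB images
token_ids = torch.randint(0, 1000, (4, 20))  # dummy tokenised descriptions
labels = torch.tensor([0.0, 1.0, 1.0, 0.0])  # 1 = needs renovation

logits = model(images, token_ids)
loss = nn.BCEWithLogitsLoss()(logits, labels)
loss.backward()  # gradients flow into both encoders and the head
```

What I am missing is how to swap the two placeholder encoders for fastai's pretrained image and text models while keeping a single head like this on top.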


#4

This is something I am interested in as well. @ecatkins have you managed to make any progress?


(Navneet Kumar Chaudhary) #5

This seems to be the relevant dataset, and you will find a lot of kernels for it as well.


They have provided image metadata and text descriptions of the pets that need to be adopted.
I hope this helps.