How to use images and tabular data in one model?

Hello :slight_smile:

I have some images and a CSV file containing additional features for each image, and I want to do binary classification.

My idea is to use a CNN for the images (already working), use another network (a tabular model) for the CSV data, and merge them at some layer to produce a single output.

My question is: can I make a learner with two inputs?
Or should I make a single input by concatenating the image and the tabular info, slice it apart in the first layer, and run two branches until the merge?

Is this even possible, or should I stay with two separate models and do some kind of weighted voting on the predictions?
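For reference, the weighted-voting fallback could be as simple as the sketch below. It assumes each model already outputs a positive-class probability per sample; the function names, the weight `w_image`, and the example numbers are all made up for illustration, and the weight would normally be tuned on a validation set.

```python
def weighted_vote(p_image, p_tabular, w_image=0.5):
    """Blend two positive-class probabilities with a convex weight."""
    if not 0.0 <= w_image <= 1.0:
        raise ValueError("w_image must be in [0, 1]")
    return w_image * p_image + (1.0 - w_image) * p_tabular


def blend_predictions(image_probs, tabular_probs, w_image=0.5, threshold=0.5):
    """Return a (blended probability, predicted label) pair per sample."""
    if len(image_probs) != len(tabular_probs):
        raise ValueError("both models must score the same samples")
    results = []
    for p_img, p_tab in zip(image_probs, tabular_probs):
        p = weighted_vote(p_img, p_tab, w_image)
        results.append((p, int(p >= threshold)))
    return results


# Hypothetical per-sample probabilities from the two separate models.
image_probs = [0.9, 0.2, 0.6]
tabular_probs = [0.7, 0.4, 0.3]

# Trust the image model three times as much as the tabular model.
blended = blend_predictions(image_probs, tabular_probs, w_image=0.75)
```

Unlike a merged network, this keeps the two models fully independent, so neither can learn interactions between image content and the tabular features.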


Thank you! :sunny:


It is definitely possible, but you will need to write some code to integrate it into fastai.
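To give a rough idea of the model side, here is a minimal plain-PyTorch sketch of a two-branch network whose `forward` takes two inputs. The class name `ImageTabularNet`, the layer sizes, and the tiny CNN are assumptions for illustration only (in practice the image branch would be your existing CNN body), and the fastai data pipeline needed to feed it two inputs is not shown.

```python
import torch
import torch.nn as nn


class ImageTabularNet(nn.Module):
    """Two branches (CNN for images, MLP for tabular data) merged into one head."""

    def __init__(self, n_tab_features, n_classes=2):
        super().__init__()
        # Image branch: a tiny stand-in for a real (e.g. pretrained) CNN body.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, 16, 1, 1)
            nn.Flatten(),             # -> (batch, 16)
        )
        # Tabular branch: a small MLP over the continuous CSV features.
        self.tab = nn.Sequential(
            nn.Linear(n_tab_features, 8),
            nn.ReLU(),
        )
        # Joint head applied to the concatenated features of both branches.
        self.head = nn.Sequential(
            nn.Linear(16 + 8, 32),
            nn.ReLU(),
            nn.Linear(32, n_classes),
        )

    def forward(self, image, tabular):
        img_feat = self.cnn(image)      # (batch, 16)
        tab_feat = self.tab(tabular)    # (batch, 8)
        merged = torch.cat([img_feat, tab_feat], dim=1)  # (batch, 24)
        return self.head(merged)        # (batch, n_classes) logits


# Random stand-in batch: 4 RGB images of 32x32 and 5 tabular features each.
model = ImageTabularNet(n_tab_features=5)
images = torch.randn(4, 3, 32, 32)
tabular = torch.randn(4, 5)
logits = model(images, tabular)
```

Because the two inputs are separate arguments, there is no need to pack image and tabular data into one tensor and slice it apart in the first layer.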

Take a look at this solution from a 1st-place finish in a Kaggle competition:


I have been wondering about that for a while (in the context of NLP and tabular data). I found two resources recently that show how to do that in fastai.

  1. This fastai post points you to this great Jupyter notebook example; ConcatModel is the relevant part.

  2. In the Kaggle Quickdraw competition Radek made some great example notebooks. One of them shows how to combine 4 model architectures together (look at the MixedInputModel).


Thank you both, that’s great!

Thanks very much for the info, very useful! I am a bit curious why the author of the notebook includes the decoder part of the NLP and tabular models before concatenating. I thought you would concatenate only the last embedding layers of the NLP and tabular models and then attach a single decoder on top. Does anyone have any insight on this?
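To make the two options concrete, here is a small PyTorch sketch of both fusion points. It does not reproduce the notebook; the class names `FeatureFusion` and `OutputFusion`, the feature sizes, and the random inputs are hypothetical, and each branch is assumed to have already produced a fixed-size feature vector.

```python
import torch
import torch.nn as nn

# Hypothetical sizes of the per-branch feature vectors and the output.
N_TEXT, N_TAB, N_CLASSES = 12, 6, 2


class FeatureFusion(nn.Module):
    """Concatenate the branch features first, then apply one joint head."""

    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(N_TEXT + N_TAB, 16),
            nn.ReLU(),
            nn.Linear(16, N_CLASSES),
        )

    def forward(self, text_feat, tab_feat):
        return self.head(torch.cat([text_feat, tab_feat], dim=1))


class OutputFusion(nn.Module):
    """Each branch keeps its own head; a final layer mixes their outputs."""

    def __init__(self):
        super().__init__()
        self.text_head = nn.Linear(N_TEXT, N_CLASSES)
        self.tab_head = nn.Linear(N_TAB, N_CLASSES)
        self.mix = nn.Linear(2 * N_CLASSES, N_CLASSES)

    def forward(self, text_feat, tab_feat):
        outs = torch.cat(
            [self.text_head(text_feat), self.tab_head(tab_feat)], dim=1
        )
        return self.mix(outs)


# Random stand-ins for the features each branch would produce for 4 samples.
text_feat = torch.randn(4, N_TEXT)
tab_feat = torch.randn(4, N_TAB)
early = FeatureFusion()(text_feat, tab_feat)
late = OutputFusion()(text_feat, tab_feat)
```

Both variants produce logits of the same shape; they differ in whether the joint layers see the full feature vectors or only each branch's already-reduced output. Which of the two the notebook author intended, and why, is not stated in this thread.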