Combining vision and tabular models in v2

florianl · April 5, 2020, 4:59pm

Hi,

I would like to build a model that combines, e.g., image and tabular data (example dataset: https://www.kaggle.com/c/petfinder-adoption-prediction). I saw a couple of notebooks for v1 but couldn’t find any for v2. Has anybody done that already? If not, could you give me some directions on how to build:

a dataloader that uses ImageBlock, TabularBlock, CategoryBlock (how do I provide two get_item functions? how does the dis.dataloader(path) work with two blocks?)
a model that combines the image and tabular data
I guess I have to build two custom PyTorch models, cut the heads, and add a custom head that combines the two models?!
a learner that works with the dataloader and model

Thanks Florian

jwuphysics · April 6, 2020, 11:03am

I’ve only done this for v1 and haven’t tried it in v2 yet. However, there is documentation for custom nets requiring multiple inputs, e.g., Siamese networks.
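
For anyone landing here, a rough sketch of what a multi-input net for image plus tabular data could look like in plain PyTorch. This is only an illustration of the general idea (two bodies, features concatenated, one shared head), not code from the docs or from any notebook. The class name `ImageTabularNet`, the layer sizes, and the feature and class counts are all made up for the example; in practice the image branch would more likely be a pretrained body with its head cut off.

```python
import torch
import torch.nn as nn

class ImageTabularNet(nn.Module):
    """Hypothetical two-branch net: tiny CNN for images, small MLP for tabular features."""
    def __init__(self, n_tab_features, n_classes):
        super().__init__()
        # Image branch: stand-in for a pretrained body with its head removed
        self.image_body = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),  # -> (batch, 16)
        )
        # Tabular branch: assumes the features are already numeric
        self.tab_body = nn.Sequential(nn.Linear(n_tab_features, 8), nn.ReLU())
        # Custom head that sees the concatenated features of both branches
        self.head = nn.Linear(16 + 8, n_classes)

    def forward(self, image, tab):
        feats = torch.cat([self.image_body(image), self.tab_body(tab)], dim=1)
        return self.head(feats)

# Arbitrary sizes, just to check that the shapes line up
model = ImageTabularNet(n_tab_features=5, n_classes=4)
logits = model(torch.randn(2, 3, 64, 64), torch.randn(2, 5))
```

The forward method takes two arguments, so whatever feeds the model has to yield two inputs per batch before the target.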

VishnuSubramanian · April 6, 2020, 11:44am

Did you try passing three different functions to getters in DataBlock?

Check this thread.

florianl · April 6, 2020, 10:27pm

Thanks for the links. Now that I figured out how to pass more than two get_x (by using getters), I found that there is no TabularBlock. There is only a TabularDataloader and TabularPandas. I couldn’t figure out how to use them to build a combined Datablock or Dataloader. So if anyone has ideas on how to do that (@muellerzr maybe?), please let me know.

muellerzr · April 6, 2020, 10:31pm

There’s a link where someone attempted to combine Tabular with Text. You can’t data block it right away because TabularPandas isn’t a block (and isn’t really anything like the rest of the API, it’s kind of floating separately). I’ll find it in a moment and edit this post.

Edit:

Found it @florianl

However, in general the API will let you use any number of inputs and outputs. When using the high-level DataBlock, specify n_inp=2 to treat the first two blocks you pass in as inputs.

florianl · April 6, 2020, 10:45pm

Thanks! That’s a good starting point.