Suppose that each example has both an image and tabular data. For the image, we can use a CNN-based model, and for the tabular data, we can use embeddings and fully connected layers. With fastai, it is easy to build a separate model for each type of data.
But what if we want to build a single model? My thought is that, for each sample, we load its image and its tabular data and feed them to a CNN and a fully connected network, respectively. We then concatenate the two outputs and feed the result into another fully connected network to generate the final predictions.
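To make the idea concrete, here is a rough sketch of the architecture in plain PyTorch rather than through any fastai-specific API. The class name `ImageTabularNet`, the layer sizes, and the embedding-size rule are all my own placeholders, and the tiny CNN only stands in for whatever backbone one would really use.

```python
import torch
import torch.nn as nn


class ImageTabularNet(nn.Module):
    """Hypothetical two-branch model: a CNN for the image, embeddings plus a
    fully connected layer for the tabular data, and a fully connected head."""

    def __init__(self, cat_cardinalities, n_cont, n_out, img_feat=64, tab_feat=32):
        super().__init__()
        # Image branch: a small stand-in CNN (a pretrained backbone could replace it).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, img_feat, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, img_feat, 1, 1)
            nn.Flatten(),             # -> (batch, img_feat)
        )
        # Tabular branch: one embedding per categorical column.
        # The embedding-size rule here is an arbitrary assumption.
        self.embeds = nn.ModuleList(
            [nn.Embedding(card, min(50, (card + 1) // 2)) for card in cat_cardinalities]
        )
        n_emb = sum(e.embedding_dim for e in self.embeds)
        self.tab = nn.Sequential(nn.Linear(n_emb + n_cont, tab_feat), nn.ReLU())
        # Head: takes the concatenated outputs of both branches.
        self.head = nn.Sequential(
            nn.Linear(img_feat + tab_feat, 32), nn.ReLU(), nn.Linear(32, n_out)
        )

    def forward(self, img, x_cat, x_cont):
        img_out = self.cnn(img)
        # Look up each categorical column in its own embedding table.
        emb = [e(x_cat[:, i]) for i, e in enumerate(self.embeds)]
        tab_out = self.tab(torch.cat(emb + [x_cont], dim=1))
        return self.head(torch.cat([img_out, tab_out], dim=1))


# Usage with random data: 4 samples, 2 categorical and 3 continuous columns.
model = ImageTabularNet(cat_cardinalities=[5, 10], n_cont=3, n_out=2)
img = torch.randn(4, 3, 64, 64)
x_cat = torch.stack([torch.randint(0, 5, (4,)), torch.randint(0, 10, (4,))], dim=1)
x_cont = torch.randn(4, 3)
out = model(img, x_cat, x_cont)
```

Because the three sub-networks live in one `nn.Module`, a single backward pass would train all of them jointly.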
I was wondering how best to implement this using fastai. I imagine that we first need to build a custom data type or DataBunch that can load both image and tabular data, and then a model that integrates the three neural networks. I would love to hear your thoughts. Thanks!
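For the data side, this is roughly the kind of loader I have in mind, again sketched in plain PyTorch. `ImageTabularDataset` is a hypothetical name, and the in-memory random tensors only stand in for real image files and a real table; how to wrap this in fastai's own data classes is exactly the part I am unsure about.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ImageTabularDataset(Dataset):
    """Hypothetical dataset that yields ((image, categorical, continuous), target)."""

    def __init__(self, images, x_cat, x_cont, targets):
        # All four sources must describe the same samples in the same order.
        assert len(images) == len(x_cat) == len(x_cont) == len(targets)
        self.images, self.x_cat = images, x_cat
        self.x_cont, self.targets = x_cont, targets

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, idx):
        # Group the inputs in one tuple so the target stays a separate item.
        x = (self.images[idx], self.x_cat[idx], self.x_cont[idx])
        return x, self.targets[idx]


# Placeholder data: 8 samples, 2 categorical and 3 continuous columns.
ds = ImageTabularDataset(
    images=torch.randn(8, 3, 64, 64),
    x_cat=torch.randint(0, 5, (8, 2)),
    x_cont=torch.randn(8, 3),
    targets=torch.randint(0, 2, (8,)),
)
loader = DataLoader(ds, batch_size=4)

# The default collate function batches each element of the nested tuple.
(img, x_cat, x_cont), y = next(iter(loader))
```

Each batch then carries the three inputs together, so a model whose `forward` takes an image, categorical codes, and continuous values could consume them directly.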