Vision_Tabular: Simple Hybrid (Image + Tabular) Models for FastAI V2

Hi everyone!

I’ve been working on a project where I needed to combine standard CNNs (ResNets, etc.) with tabular metadata (categorical embeddings + continuous features). I’m sharing an alpha-stage example in case it helps someone.

It implements late fusion: the image runs through the CNN body, the resulting features are flattened and concatenated with the tabular features, and the combined vector goes through a custom head.
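To make the idea concrete, here is a minimal plain-PyTorch sketch of late fusion. It is not the code from the extension: `LateFusionNet`, the tiny conv body, and the layer sizes are made up for illustration, and it assumes the tabular input is already a numeric vector of length `n_tab` (i.e. any categorical embeddings were applied upstream).

```python
import torch
import torch.nn as nn


class LateFusionNet(nn.Module):
    """Illustrative late-fusion model (hypothetical, not the extension's class)."""

    def __init__(self, n_tab: int, n_classes: int, n_img_feats: int = 32):
        super().__init__()
        # Tiny stand-in for a pretrained CNN body (e.g. a ResNet without its head).
        self.body = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, n_img_feats, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, n_img_feats, 1, 1)
            nn.Flatten(),             # -> (batch, n_img_feats)
        )
        # The head sees image features and tabular features side by side.
        self.head = nn.Sequential(
            nn.Linear(n_img_feats + n_tab, 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),
        )

    def forward(self, img: torch.Tensor, tab: torch.Tensor) -> torch.Tensor:
        img_feats = self.body(img)                  # image branch
        fused = torch.cat([img_feats, tab], dim=1)  # the fusion step
        return self.head(fused)


model = LateFusionNet(n_tab=20, n_classes=3)
images = torch.randn(4, 3, 64, 64)  # batch of 4 RGB images
tabular = torch.randn(4, 20)        # batch of 4 tabular feature vectors
logits = model(images, tabular)
print(logits.shape)
```

The only place the two modalities meet is the `torch.cat` call, which is why the head's first linear layer needs to know `n_tab` up front.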

Features

  • ImageTabularDataLoaders: Helper to merge existing vis_dls and tab_dls.
  • vision_tabular_learner: Drop-in replacement for vision_learner that handles the fusion head automatically.
  • export/load: Fully supports learn.export() for inference.

Code & Example

You can see the example here: Vision+Tabular FastAI Extension example · GitHub

Quick Usage:

# 1. Merge your existing DataLoaders
hybrid_dls = ImageTabularDataLoaders.from_dls(
    vis_dls=vis_dls,
    tab_dls=tab_dls,
    vocab=vis_dls.vocab,
    image_col='image_path',
    y_name='label',
    image_path_prefix=path/"images",
    item_tfms=[Resize(224)]
)

# 2. Train (n_tab is the size of your tabular features)
learn = vision_tabular_learner(hybrid_dls, resnet34, n_tab=20, metrics=accuracy)
learn.fit_one_cycle(5)

Hope this helps anyone dealing with hybrid datasets!
