Hi everyone!
I’ve been working on a project where I needed to combine standard CNNs (ResNets, etc.) with tabular metadata (categorical embeddings + continuous features). I’m sharing an alpha-stage example in case it helps someone:
It implements late fusion: the image is run through the CNN body, the resulting features are flattened and concatenated with the tabular features, and the combined vector is passed through a custom head.
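For anyone who wants to see the idea in isolation, here is a minimal plain-PyTorch sketch of late fusion. It is not the code from the extension: `LateFusionNet`, the tiny convolutional body, and the layer sizes are all made up for illustration, and it assumes the tabular input is already a single float vector per sample (categorical embeddings already applied).

```python
import torch
from torch import nn


class LateFusionNet(nn.Module):
    """Toy late-fusion model (hypothetical, not the extension's implementation)."""

    def __init__(self, n_tab: int, n_classes: int, n_img_feats: int = 16):
        super().__init__()
        # Stand-in for a real CNN body (e.g. a ResNet without its classifier).
        self.body = nn.Sequential(
            nn.Conv2d(3, n_img_feats, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, n_img_feats, 1, 1)
        )
        # Custom head that sees image features and tabular features together.
        self.head = nn.Sequential(
            nn.Linear(n_img_feats + n_tab, 32),
            nn.ReLU(),
            nn.Linear(32, n_classes),
        )

    def forward(self, img: torch.Tensor, tab: torch.Tensor) -> torch.Tensor:
        # Flatten the CNN output to (batch, n_img_feats).
        img_feats = torch.flatten(self.body(img), start_dim=1)
        # Late fusion: concatenate along the feature dimension.
        fused = torch.cat([img_feats, tab], dim=1)
        return self.head(fused)


model = LateFusionNet(n_tab=20, n_classes=5)
images = torch.randn(4, 3, 64, 64)   # batch of 4 RGB images
tabular = torch.randn(4, 20)         # batch of 4 tabular feature vectors
logits = model(images, tabular)
print(logits.shape)
```

The only part that really matters is the `torch.cat` along `dim=1`: the head's first linear layer has to accept the image feature size plus the tabular feature size.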
Features
- `ImageTabularDataLoaders`: Helper to merge existing `vis_dls` and `tab_dls`.
- `vision_tabular_learner`: Drop-in replacement for `vision_learner` that handles the fusion head automatically.
- export/load: Fully supports `learn.export()` for inference.
Code & Example
You can see an example here: Vision+Tabular FastAI Extension example · GitHub
Quick Usage:
# 1. Merge your existing DataLoaders
hybrid_dls = ImageTabularDataLoaders.from_dls(
    vis_dls=vis_dls,
    tab_dls=tab_dls,
    vocab=vis_dls.vocab,
    image_col='image_path',
    y_name='label',
    image_path_prefix=path/"images",
    item_tfms=[Resize(224)]
)
# 2. Train (n_tab is the size of your tabular features)
learn = vision_tabular_learner(hybrid_dls, resnet34, n_tab=20, metrics=accuracy)
learn.fit_one_cycle(5)
Hope this helps anyone dealing with hybrid datasets!