Combine text and tabular models in fastai v2

ajka · October 11, 2020, 4:01am

I have trained two separate models: one for one-hot encoded data, and one for text data. The models are defined below.

Model for one-hot encoded data:

from fastai.tabular.all import *

training_data = TabularPandas(
    df=training_dataframe,
    procs=[Categorify],
    cat_names=BINARY_COLS,
    splits=splits,
    y_names=TARGET_COL,
)

training_dataloaders = training_data.dataloaders(1024)

learner = tabular_learner(
    training_dataloaders,
    layers=[256,256],
    loss_func=CrossEntropyLossFlat(),
    metrics=[accuracy],
    model_dir=MODELS_DIR,
)

test_dataloaders = learner.dls.test_dl(test_dataframe_one_hot)
probabilities, targets = learner.get_preds(dl=test_dataloaders)

Model for text data:

from fastai.text.all import *

classifier_text_block = TextBlock.from_df(
    text_cols=[SEQUENCE_COL],
    vocab=language_model_dataloaders.vocab,
    tok=bpe_tokenizer,
    n_workers=8,
)

classifier_data_block = DataBlock(
    blocks=(classifier_text_block, CategoryBlock),
    get_x=ColReader("text"),
    get_y=ColReader(TARGET_COL),
    splitter=training_validation_split,
)

classifier_dataloaders = classifier_data_block.dataloaders(
    training_data,
    bs=128,
    seq_len=80
)

classifier = text_classifier_learner(
    dls=classifier_dataloaders,
    arch=AWD_LSTM,
    seq_len=80,
    config=lstm_clas_configuration,
    pretrained=False,
    metrics=[accuracy],
    model_dir=MODELS_DIR,
).to_fp16()

classifier = classifier.load_encoder("lstm_language_model_finetuned")

test_dataloaders = classifier.dls.test_dl(test_dataframe_text)
probabilities, targets = classifier.get_preds(dl=test_dataloaders)

I have already trained both models. Now I want to take the outputs of both models and use them as inputs to a new model that effectively combines the one-hot and text models into a larger model. How do I do this?

(I read a few forum posts that seem to describe something similar, but I’m having trouble understanding them and how it applies to my situation.)

If anyone is able to help, I’d greatly appreciate it!

ajka · October 12, 2020, 12:18am

As far as I can tell, there are three different approaches I could take here:

1. Create a new tabular_learner. Use a TransformBlock or something similar to call learner.predict from my one-hot model and classifier.predict from my text model on the appropriate columns of the dataframe, then pass the probabilities output by both models to the new tabular model as continuous variables.
   - This seems like a hackish approach, but it should work as intended without too many problems. I'm going to try this first since it seems to be the easiest to implement.
2. Remove the "heads" from both models, then attach them both to a new, single head. In this way, the two models would "feed into" the same head.
   - This seems like the method that's closest to what I actually want to do, but I'm not sure how to implement it. Is this possible in fastai? If so, can anyone point me to resources to learn how to do this? (Or perhaps even demonstrate how I would do this with a code snippet?)
3. Create a new MixedTabular data block and learner, as in this example using fastai v1.
   - This seems like it would work, but it's not quite what I wanted to do. I have already trained my two "sub-models", but this would entail training an entirely new LSTM and tabular model from scratch. (Perhaps I should consider this approach, though. I'm not entirely sure whether training the "sub-models" first is the correct approach.)
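
For the first approach, the stacking step might look roughly like the sketch below. It assumes the class probabilities from both models have already been computed for the same rows in the same order (for example with get_preds); the helper stack_probabilities and the column names tab_p0, text_p0, and target are made up for illustration, and the numbers are toy values.

```python
import numpy as np
import pandas as pd


def stack_probabilities(tabular_probs, text_probs, targets):
    """Build a dataframe whose continuous columns are both models' class probabilities.

    Hypothetical helper: the column names are illustrative, not part of fastai.
    """
    tabular_probs = np.asarray(tabular_probs, dtype=float)
    text_probs = np.asarray(text_probs, dtype=float)
    if not (len(tabular_probs) == len(text_probs) == len(targets)):
        raise ValueError("all inputs must describe the same rows in the same order")

    tab_cols = [f"tab_p{i}" for i in range(tabular_probs.shape[1])]
    text_cols = [f"text_p{i}" for i in range(text_probs.shape[1])]

    # One row per example, one column per (model, class) probability.
    stacked = pd.DataFrame(
        np.hstack([tabular_probs, text_probs]),
        columns=tab_cols + text_cols,
    )
    stacked["target"] = list(targets)
    return stacked, tab_cols + text_cols


# Toy probabilities standing in for the two models' outputs.
tabular_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
text_probs = [[0.7, 0.3], [0.4, 0.6], [0.1, 0.9]]
targets = [0, 1, 1]

stacked_df, cont_names = stack_probabilities(tabular_probs, text_probs, targets)
print(stacked_df)
```

The resulting dataframe could then go into a new TabularPandas with cont_names=cont_names and y_names="target". One caveat: the probabilities should probably come from rows the sub-models were not trained on, otherwise the new model may learn to trust overconfident predictions.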

@muellerzr, it looks like you wrote the post that I linked to above. Do you have any thoughts about this? (Or @stefan-ai, given your expertise in NLP and tabular data?)

stefan-ai · October 12, 2020, 8:57am

Hi @ajka,

I haven't combined tabular and text data in a model myself. I think you are right that there is no single solution to your problem.

Regarding 1: This could work if you are predicting the same classes with your tabular and text classification models. But if you have multiple text columns that you want to classify separately and then ensemble the predicted probabilities with those from the tabular model, you have to consider how best to weight the probabilities for ensembling. E.g. if you have five text classifiers and one tabular model and simply average the predicted probabilities, the tabular model will have very little influence on the final output. So you could either take a weighted average of the predicted probabilities or feed the texts from all text columns together into one text classifier.
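
To make the weighting point concrete, here is a small sketch with made-up probabilities for a single example: five text classifiers that lean towards class 1 and one tabular model that is confident in class 0. The function weighted_average and the weights are illustrative assumptions, not a fastai API.

```python
def weighted_average(prob_sets, weights):
    """Combine per-model class probabilities for one example with a weighted average."""
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    n_classes = len(prob_sets[0])
    return [
        sum(w * probs[c] for probs, w in zip(prob_sets, weights)) / total
        for c in range(n_classes)
    ]


# Made-up outputs: five text classifiers and one tabular model.
text_probs = [[0.4, 0.6]] * 5
tabular_probs = [0.9, 0.1]
all_probs = text_probs + [tabular_probs]

# Plain average: each model counts equally, so the tabular model is 1/6 of the vote.
plain = weighted_average(all_probs, [1, 1, 1, 1, 1, 1])

# Weighted average: the tabular model counts as much as all five text models together.
balanced = weighted_average(all_probs, [1, 1, 1, 1, 1, 5])

print("plain:", plain)
print("balanced:", balanced)
```

With the plain average the ensemble sides with the text classifiers (class 1); once the tabular model is weighted to match the five text models combined, the prediction flips to class 0. In practice the weights would need to be tuned on validation data rather than picked by hand.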

Regarding 2: I think the best approach here would be to get a sentence/document encoding from your text classifier and then feed this encoding along with your tabular features into a tabular model. As above, you need to decide whether you want separate text classifiers for each text column or a single model with the concatenated texts. If you have separate classifiers, you could either average their resulting encodings or concatenate them before feeding them into the tabular model. Following this approach, you would only need to re-train your tabular model.
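
The concatenation step could be sketched in plain PyTorch as below. This is only an illustration under assumptions: ConcatHead is a hypothetical module (a small MLP standing in for the tabular model), the dimensions 400 and 16 are arbitrary, and random tensors stand in for the document encodings and tabular features. Extracting the encodings from the trained classifier is not shown.

```python
import torch
import torch.nn as nn


class ConcatHead(nn.Module):
    """Hypothetical head: classify from a text encoding concatenated with tabular features."""

    def __init__(self, text_dim, tab_dim, hidden, n_classes):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(text_dim + tab_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, text_encoding, tab_features):
        # Join the two feature vectors along the feature axis, one row per example.
        combined = torch.cat([text_encoding, tab_features], dim=1)
        return self.layers(combined)


torch.manual_seed(0)
batch_size = 4
text_encoding = torch.randn(batch_size, 400)  # stand-in for document encodings
tab_features = torch.randn(batch_size, 16)    # stand-in for tabular features

head = ConcatHead(text_dim=400, tab_dim=16, hidden=256, n_classes=3)
logits = head(text_encoding, tab_features)
print(logits.shape)
```

If the encodings are precomputed and stored as extra continuous columns, only this head needs training. Keeping the encoder inside the model instead would allow fine-tuning it end to end, at the cost of a custom DataLoader that yields both inputs.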

Here is a blog post that shows how to get document encodings from ULMFiT (in fastai v1, but it should be fairly similar in v2): https://medium.com/@alden_6876/getting-document-encodings-from-fastais-ulmfit-language-model-a3f9271f9ecd

Regarding 3: I’m not sure about this approach. So you would like to feed both tabular and text features directly into a single model?