ULMFiT text_classifier on sentence pairs

Has anyone tried ULMFiT text classification with text input pairs?
Instead of classifying a single text by feeding the encoder output into PoolingLinearClassifier, we would have a separate encoder for each text in the pair and concatenate the encoder outputs before passing them to PoolingLinearClassifier.
Thoughts?
We could hack it by concatenating the two texts into a single stream and fitting a plain text_classifier_learner, but then it is harder for the model to know where one text ends and the other begins.
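
Something like the sketch below is what I have in mind, written in plain PyTorch rather than fastai's Learner plumbing. `encoder_a` / `encoder_b` stand for whatever maps a batch of token ids to one pooled vector per text (e.g. a fine-tuned ULMFiT/AWD-LSTM encoder plus pooling); the names, `enc_dim` and the head sizes are placeholders I made up, not working fastai code.

```python
import torch
import torch.nn as nn

class DualEncoderClassifier(nn.Module):
    """Encode each text of the pair separately, concatenate the pooled
    outputs, then classify. Each encoder is assumed to map a batch of
    token ids to a single pooled vector of size `enc_dim`."""
    def __init__(self, encoder_a, encoder_b, enc_dim, n_classes, p_drop=0.1):
        super().__init__()
        self.encoder_a = encoder_a
        self.encoder_b = encoder_b
        self.head = nn.Sequential(
            nn.Dropout(p_drop),
            nn.Linear(2 * enc_dim, enc_dim),
            nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(enc_dim, n_classes),
        )

    def forward(self, text_a, text_b):
        u = self.encoder_a(text_a)   # (batch, enc_dim)
        v = self.encoder_b(text_b)   # (batch, enc_dim)
        return self.head(torch.cat([u, v], dim=1))
```

Passing the same module as both encoders would give a siamese (weight-shared) variant, which is probably where I'd start if the two texts are the same kind of thing.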


Haven’t tried it but this seems relevant:

see LinearComparator

Thread discussing results: What does the encoder actually learn? 🤔
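
I don't know exactly what that LinearComparator does, but a common way to compare two encoded texts (InferSent-style) is to build features like [u, v, |u - v|, u * v] from the two pooled vectors and put a small classifier on top. A rough sketch of that pattern, with `enc_dim`, `hidden` and the feature set being my own assumptions rather than the linked code:

```python
import torch
import torch.nn as nn

class ComparatorHead(nn.Module):
    """Combine two pooled sentence vectors u and v into comparison
    features [u, v, |u - v|, u * v] and classify. A guess at the
    general idea only; see the linked code for the real thing."""
    def __init__(self, enc_dim, n_classes, hidden=512):
        super().__init__()
        self.classifier = nn.Sequential(
            nn.Linear(4 * enc_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, u, v):
        feats = torch.cat([u, v, (u - v).abs(), u * v], dim=1)
        return self.classifier(feats)
```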


Bumping this.
Is there a simple way to build a text classifier that takes two separate texts as input?

I’d recommend looking at some of the winning solutions from Kaggle’s recent QUEST Q&A competition: https://www.kaggle.com/c/google-quest-challenge/discussion/129844

Some of the winners used two BERT models and fed the last BERT layer of each into another linear layer or two: one BERT model was trained on questions, the other on answers. Some of them have shared their code too if you need inspiration.
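
For a rough idea of the shape of that kind of two-encoder setup with Hugging Face transformers, something like the sketch below; this is just the general pattern, not any particular winner's solution, and `model_name`, the [CLS] pooling and the head sizes are my own assumptions.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class DualBertClassifier(nn.Module):
    """Two separate BERT encoders (one for questions, one for answers);
    their [CLS] representations are concatenated and passed through a
    small linear head. A sketch of the general idea only."""
    def __init__(self, n_classes, model_name="bert-base-uncased", p_drop=0.2):
        super().__init__()
        self.q_bert = AutoModel.from_pretrained(model_name)
        self.a_bert = AutoModel.from_pretrained(model_name)
        hidden = self.q_bert.config.hidden_size
        self.head = nn.Sequential(
            nn.Dropout(p_drop),
            nn.Linear(2 * hidden, hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),
            nn.Linear(hidden, n_classes),  # n_classes = number of target columns
        )

    def forward(self, q_ids, q_mask, a_ids, a_mask):
        q_cls = self.q_bert(input_ids=q_ids, attention_mask=q_mask)[0][:, 0]  # [CLS] of question
        a_cls = self.a_bert(input_ids=a_ids, attention_mask=a_mask)[0][:, 0]  # [CLS] of answer
        return self.head(torch.cat([q_cls, a_cls], dim=1))
```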