Thanks for bringing this up, @TomLisankie. I saw a few people using this for the Kaggle Quora competition, e.g. here and here, and it sounds interesting. The article here mentioned that they beat one of the other two Siamese networks by using
XGBoost and a bunch of hand-selected features. They also described a DL architecture afterwards, which to me looks a bit like a Siamese network with lots of extra layers.
Since this is all more than a year old, I’m wondering whether we can now simplify this with more powerful approaches such as ULMFiT. Did you by any chance try out ULMFiT with Siamese networks? I’d be curious about your experience. Otherwise, that might be one of the next things I’ll try.