Interesting text comparison via deep learning example

I found this article quite interesting. It compares feature engineering plus XGBoost against deep learning for the problem of figuring out whether a question on Quora is a duplicate or not. The author goes into quite a bit of detail about his architecture, which I believe is novel, and provides a link to it on his GitHub.

Thought everyone here might find it useful as well.


Interesting. That is a very complex network indeed.
I wonder if you can get good results in this competition with something simpler, perhaps like what we did in the first part of the course.
I saw a script on Kaggle that may get you to 0.35 loss with a two-feature XGBoost model, while this fancy 20-layer NN got this guy to 0.25.
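To give a feel for what a two-feature approach could look like, here is a rough sketch. I don't remember which two features that script actually used, so the ones below (word overlap and length difference) are my own guesses, and `tokenize` and `pair_features` are made-up names for illustration. The resulting pair of numbers would then be fed to a classifier such as XGBoost.

```python
PUNCT = ".,?!\"'()"

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(PUNCT) for w in text.lower().split())
    return [w for w in words if w]

def pair_features(q1, q2):
    """Return two hypothetical features for a question pair.

    jaccard:  shared words divided by all distinct words in either question
    len_diff: absolute difference in character length
    """
    w1, w2 = set(tokenize(q1)), set(tokenize(q2))
    union = w1 | w2
    jaccard = len(w1 & w2) / len(union) if union else 0.0
    len_diff = abs(len(q1) - len(q2))
    return jaccard, len_diff

# Example pair: high word overlap, nearly identical length
features = pair_features("How do I learn Python?", "How can I learn Python?")
```

Features this simple ignore word order and meaning entirely, which is probably part of the gap between the two scores.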
Another thing I’m thinking about is how to generalize this problem from finding identical questions to finding similarities in longer texts, such as news articles.