Interesting text comparison via deep learning example

Even · March 23, 2017, 11:59pm

I found this article quite interesting. It compares feature engineering plus XGBoost against deep learning for the problem of deciding whether two questions on Quora are duplicates. The author goes into quite a bit of detail about his architecture, which I believe is novel, and links to the code on his GitHub.

https://www.linkedin.com/pulse/duplicate-quora-question-abhishek-thakur

Thought everyone here might find it useful as well.

shgidi · March 24, 2017, 12:16pm

Interesting. That is a very complex network indeed.
I wonder if you can get good results in this competition with something simpler, perhaps like what we did in the first part of the course.
I saw a script on Kaggle that may get you to a loss of 0.35 with a two-feature XGBoost model, while this fancy 20-layer NN got this guy to 0.25.
Another thing I’m thinking about is how to generalize this problem from finding identical questions to finding similarities in longer texts, such as news articles.
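
To make the "two simple features" idea concrete, here is a rough sketch in plain Python. I don't know which two features that Kaggle script actually used, so the word-overlap (Jaccard) score and the character-length difference below are my own assumptions, and the function names are made up for illustration. I'm also assuming the loss numbers quoted above are log loss, so a small helper for that is included. In practice you would feed features like these to a model such as XGBoost rather than use them directly.

```python
import math


def word_set(text):
    # Lowercase and split on whitespace; a real pipeline would tokenize more carefully.
    return set(text.lower().split())


def pair_features(q1, q2):
    """Return two hypothetical features for a question pair:
    (Jaccard word overlap, absolute difference in character length)."""
    w1, w2 = word_set(q1), word_set(q2)
    union = w1 | w2
    jaccard = len(w1 & w2) / len(union) if union else 0.0
    len_diff = abs(len(q1) - len(q2))
    return jaccard, len_diff


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


if __name__ == "__main__":
    print(pair_features("How do I learn Python", "how do i learn python"))
    print(log_loss([1, 0], [0.5, 0.5]))
```

The same overlap feature works unchanged on longer texts such as news articles, though for those I would expect weighting words (for example by TF-IDF) to matter a lot more than it does for short questions.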