Hi, I wrote a post about doc2vec - a nice technique for matching documents, extracting tags, etc.
I hope you’ll like it!
Great explanation!
Also super interesting to hear about the work you’re doing for ScaleAbout.
3 questions:
- Did you try simply averaging the word2vec vectors of the document’s words to represent that document instead of using doc2vec?
- Have you looked at any RNN models to represent the content of a document?
- Have you tried LDA-2-Vec?
Hi Alex, I’m really happy you liked my post!
Re your questions:
- Averaging all the word vectors would be too noisy, since the texts are around 300 long. However, we do get good results from a two-step model: extracting keywords and averaging their pretrained word2vec vectors.
- I did try to train an RNN on our tagged docs (same as what Jeremy did in lesson 5), but the results were worse than with a CNN.
- I hadn't heard of this technique. I did try using LDA but didn't get very good results.
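For anyone curious, here is a rough sketch of the two-step idea from the first point (extract keywords, then average their pretrained vectors). It is only an illustration, not the code behind the post: the tiny `EMBEDDINGS` dictionary stands in for real pretrained word2vec vectors, and the frequency-based `extract_keywords` helper and `STOPWORDS` set are assumptions, since the keyword extraction method isn't described here.

```python
from collections import Counter

import numpy as np

# Hypothetical stand-in for pretrained word2vec vectors (toy 3-d values).
EMBEDDINGS = {
    "neural": np.array([1.0, 0.0, 0.0]),
    "network": np.array([0.0, 1.0, 0.0]),
    "training": np.array([0.0, 0.0, 1.0]),
}

# Assumed stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "is", "of", "and", "for"}


def extract_keywords(text, top_k=2):
    """Step 1 (assumed method): keep the top_k most frequent non-stopwords."""
    tokens = [t for t in text.lower().split() if t not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(top_k)]


def doc_vector(keywords, embeddings):
    """Step 2: average the vectors of the keywords found in the vocabulary."""
    dim = next(iter(embeddings.values())).shape[0]
    vectors = [embeddings[w] for w in keywords if w in embeddings]
    if not vectors:
        # No keyword has a vector, so fall back to a zero vector.
        return np.zeros(dim)
    return np.mean(vectors, axis=0)


text = "the neural network is a neural network for training"
keywords = extract_keywords(text)
vector = doc_vector(keywords, EMBEDDINGS)
```

Averaging only a handful of keywords instead of every word keeps the common filler words from diluting the document vector, which is the noise problem mentioned above.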
Hi @markovbling, I was wondering whether you had used LDA-2-Vec, and if so, what your experiences were! Cheers