"Moving past word2vec embeddings"?

Over the last year I’ve been working a lot with doc2vec and word2vec, so I was intrigued by @jeremy’s comment (here) that:

[…] in NLP I’m really pushing the idea that [we] need to move past word2vec, glove […] because those embeddings are way less predictive than embeddings learnt from deep models […]

Could you please elaborate on this? Are you basically saying “forget about word2vec, simply start with a randomly initialized embedding layer (wired into an RNN or whatever) and let it converge as part of the overall network training”?
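
To make the question concrete, here is roughly what I mean by the two starting points. This is only a framework-free NumPy sketch: the sizes, the stand-in "pretrained" values, the fake gradient and the `sgd_step` helper are all made up for illustration, and none of it is taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 5, 3  # toy sizes, chosen only for illustration

# Option A: random start; the vectors are learned entirely by the downstream task.
emb_random = rng.normal(scale=0.1, size=(vocab_size, dim))

# Option B: start from pretrained vectors and keep training them.
# These are stand-in values, NOT real word2vec output.
pretrained = rng.normal(size=(vocab_size, dim))
emb_pretrained = pretrained.copy()


def sgd_step(emb, token_ids, grad_out, lr=0.1):
    """Hypothetical helper: update only the rows that were looked up,
    accumulating the update when a token id appears more than once."""
    np.subtract.at(emb, token_ids, lr * grad_out)
    return emb


token_ids = np.array([1, 3, 3])       # a tiny "sentence" of token ids
grad_out = np.ones((len(token_ids), dim))  # made-up gradient from the rest of the network

before = emb_random.copy()
sgd_step(emb_random, token_ids, grad_out)      # option A gets trained...
sgd_step(emb_pretrained, token_ids, grad_out)  # ...and so does option B

# Only the rows for tokens 1 and 3 moved; everything else is untouched.
changed_rows = np.flatnonzero(np.any(emb_random != before, axis=1))
print(changed_rows)
```

In both cases the embedding rows receive the same kind of update during training; the only difference in this sketch is where the rows start.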



I think this is the key part.

If you are into NLP, the model introduced towards the end of lecture 4 and elaborated on in this paper is mind-blowing. If I had more time, or were less of a newb, I would jump into this immediately. But first I need to do a bit more with CV in preparation for part 2, and also to get a bit of that newb out :wink:

But I'm really, really hoping I will have a chance to play around with this in 2, maybe 3, weeks.