Pre-trained word vectors for 90 languages, trained on Wikipedia using fastText

I’m still doing the CNN part of the course, but I suspect these will be useful for transfer learning in NLP once I get that far:

These are a few years old now, but you can also get pretrained vectors for Word2Vec on 100 billion words from Google News (vocab size 3 million, 300 dimensional vectors):

https://code.google.com/archive/p/word2vec/
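As a rough illustration of what you do with these pretrained vectors once loaded: each word maps to a dense vector, and related words end up closer together under cosine similarity. The snippet below is a minimal sketch using tiny made-up 4-d vectors in place of the real 300-dimensional Google News ones (which you would load with e.g. gensim's `KeyedVectors.load_word2vec_format(..., binary=True)` — that call needs the downloaded `.bin` file, so it's only shown as a comment here):

```python
import numpy as np

# Toy stand-ins for pretrained embeddings. The real Google News model
# maps ~3M words to 300-d vectors; these 4-d vectors are invented for
# illustration only.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1, 0.0]),
    "queen": np.array([0.9, 0.1, 0.8, 0.0]),
    "apple": np.array([0.0, 0.1, 0.0, 0.9]),
}

# With the real file downloaded, loading would look something like:
# from gensim.models import KeyedVectors
# vectors = KeyedVectors.load_word2vec_format(
#     "GoogleNews-vectors-negative300.bin", binary=True)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Semantically related words should score higher than unrelated ones.
sim_royal = cosine(vectors["king"], vectors["queen"])
sim_fruit = cosine(vectors["king"], vectors["apple"])
print(sim_royal > sim_fruit)
```

The same idea carries over to the fastText vectors above; fastText additionally builds vectors from character n-grams, so it can produce embeddings even for words it never saw during training.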

Hey,

they updated the fastText model to cover 294 languages and added two tutorials:

Best,
Benedikt