Word2Vec Datasets

Hey all,
I am having trouble finding pre-trained Word2Vec datasets to download. Some of the articles that mention them have dead links.
I am trying to find a set of vectors trained on a respectable amount of data, with 100 features (dimensions) per word.
All the ones I came across use 300 dimensions or more… :frowning:

I’m a bit late in responding, but I hope this will be helpful to others. The pre-trained word2vec embeddings are hard to download because they are stored on the author’s Google Drive. However, you can download a copy of the 300-dimensional GoogleNews vectors from AWS S3, where it is hosted as part of the dl4j distribution.

$ wget -c "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz"
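
In case it helps once the download finishes, here is a rough sketch of reading the file in plain Python. It assumes the file uses the standard word2vec binary layout (an ASCII header line with the vocabulary size and the dimension count, then for each word the word itself, a space, and that many little-endian float32 values). The function name `read_word2vec_binary` and its `limit` parameter are my own, not part of any library.

```python
import gzip
import struct


def read_word2vec_binary(path, limit=None):
    """Read word vectors from a word2vec binary file (plain or gzip-compressed).

    Returns (dim, vectors), where vectors maps each word to a tuple of floats.
    `limit` is a hypothetical convenience parameter: read only the first N words.
    """
    opener = gzip.open if path.endswith(".gz") else open
    vectors = {}
    with opener(path, "rb") as f:
        # Header line: "<vocab_size> <dimensions>\n" in ASCII.
        vocab_size, dim = (int(x) for x in f.readline().split())
        count = vocab_size if limit is None else min(limit, vocab_size)
        for _ in range(count):
            # Read the word one byte at a time, up to the separating space.
            word = bytearray()
            while True:
                ch = f.read(1)
                if ch == b" ":
                    break
                if ch == b"":
                    raise EOFError("unexpected end of file while reading a word")
                if ch != b"\n":  # skip the newline that may follow the previous vector
                    word.extend(ch)
            # The vector is `dim` little-endian 32-bit floats.
            data = f.read(4 * dim)
            if len(data) != 4 * dim:
                raise EOFError("unexpected end of file while reading a vector")
            vectors[word.decode("utf-8", errors="replace")] = struct.unpack(f"<{dim}f", data)
    return dim, vectors


# Example (assumes the file from the wget command is in the current directory):
# dim, vectors = read_word2vec_binary("GoogleNews-vectors-negative300.bin.gz", limit=1000)
```

Reading only the first few thousand words with `limit` is a quick way to check that the download is intact without loading the whole file into memory. If you already use gensim, `KeyedVectors.load_word2vec_format(path, binary=True)` does the same job and is likely the more convenient route.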