BPE, fasttext, and other word embeddings

md1630 · October 15, 2019, 12:32am

The NLP course uses a couple of different word embeddings. The attention model uses pretrained fastText vectors. For the transformer, it is stated that we used “traditional pytorch embeddings”, although pretrained vectors could also be used there.

As I branch out to other transformer models and NLP libraries, BPE seems to be commonly regarded as the most effective approach, at least for neural machine translation. Is that right?
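
To be clear about what I mean by BPE: byte-pair encoding builds a subword vocabulary by repeatedly merging the most frequent adjacent pair of symbols, so as far as I understand it is a tokenization scheme that sits in front of the embedding layer rather than an embedding itself. Here is a toy sketch in plain Python, just to illustrate the idea. It is not the course's code or any library's implementation, and the names `learn_bpe`, `apply_bpe`, and the `</w>` end-of-word marker are my own choices for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Start from characters, plus a hypothetical end-of-word marker.
    vocab = Counter(tuple(word) + ("</w>",) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Pick the most frequent pair (ties go to the pair seen first).
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite the vocabulary with that pair merged.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(list(symbols), best))] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Split a word into subword units by replaying the learned merges in order."""
    symbols = list(word) + ["</w>"]
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


merges = learn_bpe(["ab", "ab", "ab", "cb"], num_merges=2)
print(merges)                   # [('b', '</w>'), ('a', 'b</w>')]
print(apply_bpe("ab", merges))  # ['ab</w>']
print(apply_bpe("xb", merges))  # ['x', 'b</w>']
```

The last line is the part that interests me: a word never seen in training (`xb`) still breaks down into known pieces instead of becoming a single unknown token, whereas a plain word-level embedding table has no entry for it.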