@jeremy: I’ve been working on the semantic similarity task using the Kaggle Quora duplicate question pairs dataset.
It’s a classifier rather than a seq2seq model, but I didn’t want to train from scratch, so I initialized the embedding layer with the weights from our LM and I’m training the rest of the model as usual (rough sketch below). I understand that’s not the same as using our full LM backbone, but it’s still a head start. I’m planning to try the same exercise with our AWD-LSTM backbone soon.
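In case it helps to see concretely, here’s a minimal sketch of the embedding transfer in plain PyTorch. The checkpoint path, the `encoder.weight` key, the vocab/embedding sizes, and the siamese classifier shape are all placeholders I made up for illustration, not our actual code:

```python
import torch
import torch.nn as nn

# Must match the pretrained LM's vocab and embedding size (values are assumptions).
vocab_sz, emb_dim = 60000, 400

class PairClassifier(nn.Module):
    """Siamese-style duplicate-pair classifier (hypothetical architecture)."""
    def __init__(self, vocab_sz, emb_dim, hidden_dim=256, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_sz, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim * 2, n_classes)

    def encode(self, x):
        _, (h, _) = self.encoder(self.emb(x))
        return h[-1]  # last hidden state as a fixed-size sentence vector

    def forward(self, q1, q2):
        # Encode both questions and classify the concatenated pair.
        return self.head(torch.cat([self.encode(q1), self.encode(q2)], dim=1))

model = PairClassifier(vocab_sz, emb_dim)

# Load the LM checkpoint (path and state-dict key are placeholders) and copy
# over only the embedding weights; everything else trains from scratch.
lm_state = torch.load('lm_wiki.pth', map_location='cpu')
model.emb.weight.data.copy_(lm_state['encoder.weight'])

# Optionally freeze the transferred embeddings for the first few epochs:
model.emb.weight.requires_grad = False
```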
Do you think using our own English-wiki embeddings would fare better for the fr-en translation task than using the fastText English word embeddings?