Hi there,
I want to find word embeddings for a certain set of documents (on which I will perform some operations later). Is it better to pretrain the model on a bigger corpus, like a Wikipedia dump?
Thanks, regards.
Yes, the general approach recommended by fastai is to pretrain a language model on a large corpus (like Wikipedia) and then fine-tune it on your target corpus; the embeddings come from the fine-tuned model. This is the ULMFiT workflow, and fastai ships AWD-LSTM weights already pretrained on a Wikipedia dump (WikiText-103), so in practice you only run the fine-tuning step yourself.