TLDR: I would like to update only the rows of an embedding matrix that correspond to out-of-vocab words, while keeping the rows that already have pre-trained embeddings frozen.
It seems sensible (and is common practice) to initialise out-of-vocab word vectors to something other than random values, but it seems natural to try to improve on this by then training only the rows of the embedding matrix that correspond to out-of-vocab words (in cases where training the whole embedding matrix is too expensive). How would this work in PyTorch? I'm not aware of it having been done before.
i.e. do something like:
```python
self.embedding.weight.requires_grad = False
self.embedding.weight[mask, :].requires_grad = True
```
`mask` is a boolean tensor indicating whether each word is out of vocab.
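
One workaround I'm considering: since `requires_grad` applies to a tensor as a whole (so setting it on a slice as above won't work), keep the whole matrix trainable and zero out the gradients of the pre-trained rows with a gradient hook. A minimal sketch, where the sizes, `pretrained` vectors, and `oov_mask` are all made up for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical sizes and data, just for illustration.
vocab_size, embed_dim = 10, 4
pretrained = torch.randn(vocab_size, embed_dim)  # stand-in for real pre-trained vectors
oov_mask = torch.zeros(vocab_size, dtype=torch.bool)
oov_mask[7:] = True  # pretend rows 7-9 are out of vocab

embedding = nn.Embedding(vocab_size, embed_dim)
with torch.no_grad():
    embedding.weight.copy_(pretrained)

# requires_grad stays True for the whole matrix; the hook zeroes the
# gradient of the pre-trained rows, so only the OOV rows get updated.
embedding.weight.register_hook(lambda grad: grad * oov_mask.unsqueeze(1).float())

# Quick check: backprop through a lookup that touches both kinds of rows.
loss = embedding(torch.tensor([0, 8])).sum()
loss.backward()
print(embedding.weight.grad[0])  # all zeros  -> pre-trained row 0 stays frozen
print(embedding.weight.grad[8])  # non-zero   -> OOV row 8 will be updated
```

One caveat with this approach: it only blocks gradient updates, so optimizer-side effects such as weight decay would still modify the "frozen" rows. An alternative would be two separate `nn.Embedding` layers (one frozen, one trainable) with indices routed to the right one, but is there a cleaner way?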
For reference, I’ve also asked this question here.
Many thanks in advance and happy new year!