How to expand vocab of saved model


How can I expand the vocab of my pre-trained model?

  1. I trained and saved the model.
  2. I have a new dataset with expanded vocab.
  3. I would like to load the previous result, but I can’t because the vocab is different.

Thank you in advance.


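I do not think there is an easy way to do it, as the vocab is directly tied to the embedding layer of the model: each token index selects one row of the embedding matrix, so a different vocab changes both the size of the matrix and which token each row stands for. There is a small workaround, though; I remember reading about it somewhere.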

You can create a new vocab and initialize a model with it, then replace the randomly initialized embeddings of the tokens that also appear in the old vocab with the embeddings from the old model. Tokens that are new keep their random initialization. Then train the model; it will look like transfer learning.
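A minimal sketch of that embedding copy, assuming both vocabs are plain token-to-index dicts and the embedding table is a list of rows. The helper name `expand_embeddings` and the toy vocabs are made up for illustration; in a real framework you would copy tensor rows in the same way instead of Python lists.

```python
import random

def expand_embeddings(old_vocab, old_embeddings, new_vocab, dim, seed=0):
    """Build an embedding table for new_vocab, reusing rows from the old model.

    old_vocab / new_vocab: dict mapping token -> row index (assumed layout).
    old_embeddings: list of rows, each a list of `dim` floats.
    """
    rng = random.Random(seed)
    # Start from a random initialization, as a freshly created model would.
    new_embeddings = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                      for _ in range(len(new_vocab))]
    # Overwrite the rows of tokens the old model already knew.
    # Match by token string, not by index: the same token can sit at a
    # different index in the two vocabs.
    for token, new_idx in new_vocab.items():
        old_idx = old_vocab.get(token)
        if old_idx is not None:
            new_embeddings[new_idx] = list(old_embeddings[old_idx])
    return new_embeddings

# Toy example: "bird" is new, "cat" and "dog" changed position.
old_vocab = {"<unk>": 0, "cat": 1, "dog": 2}
old_embeddings = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]]
new_vocab = {"<unk>": 0, "dog": 1, "bird": 2, "cat": 3}
new_embeddings = expand_embeddings(old_vocab, old_embeddings, new_vocab, dim=2)
```

Any other layer whose shape depends on the vocab size (for example an output projection over the vocab) would need the same treatment; the remaining weights can usually be loaded unchanged.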

Another workaround is to create a common vocab from both datasets ahead of time, before training the first model.
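A rough sketch of building such a common vocab, assuming whitespace-tokenized sentences. The function name `build_common_vocab`, the special tokens, and the toy data are illustrative assumptions, not part of any particular library.

```python
from collections import Counter

def build_common_vocab(datasets, specials=("<unk>", "<pad>"), min_freq=1):
    """Build one token -> index mapping covering every dataset."""
    counts = Counter()
    for dataset in datasets:
        for sentence in dataset:
            counts.update(sentence.split())  # assumed whitespace tokenization
    # Sort the tokens so the mapping is reproducible across runs.
    tokens = sorted(t for t, c in counts.items()
                    if c >= min_freq and t not in specials)
    itos = list(specials) + tokens
    return {tok: idx for idx, tok in enumerate(itos)}

old_data = ["the cat sat", "the dog ran"]
new_data = ["the bird flew"]
vocab = build_common_vocab([old_data, new_data])
```

If the same mapping is saved and reused for both training runs, the saved model loads without a shape mismatch. Tokens that only occur in the second dataset simply stay untrained until you continue training on it.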