Can anyone recommend a library for WordPiece tokenization?

I want to build a vocabulary based on WordPiece subwords instead of whole words. Can anyone explain the process of building a WordPiece vocabulary from a set of sentences, or point me to a library that can do this?

You might want to try SentencePiece.

Here is one example of creating subword tokens with SentencePiece:

From what I can tell, SentencePiece is close to the WordPiece tokenization used to train BERT: BERT's original WordPiece trainer was never released, but SentencePiece's BPE and unigram models learn similar subword vocabularies from raw text.

Here is a paper that you might find interesting.