Vocabulary Complexity?

I have started working on a new Kaggle project and am exploring the idea of ‘vocabulary complexity’. To start, I want to look at features such as word length, sentence length, and syllable distribution. I found the Pattern library, and it looks very promising for this.
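To make the features concrete, here is a rough sketch of what I have in mind. It uses only the Python standard library rather than Pattern, and everything in it is my own assumption: the function names are made up for illustration, the sentence splitter is a naive regex, and the syllable counter is a crude vowel-group heuristic that will miscount plenty of English words.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive assumption: a sentence ends at ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize_words(text):
    # Alphabetic tokens, optionally with one internal apostrophe (e.g. "don't").
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)


def count_syllables(word):
    # Heuristic only: count groups of consecutive vowels, then drop a
    # trailing silent "e" (but keep endings like "-le", as in "table").
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def complexity_features(text):
    sentences = split_sentences(text)
    words = tokenize_words(text)
    if not words or not sentences:
        # Nothing to measure; return neutral values instead of dividing by zero.
        return {
            "mean_word_length": 0.0,
            "mean_sentence_length": 0.0,
            "syllable_distribution": Counter(),
        }
    syllables = [count_syllables(w) for w in words]
    return {
        "mean_word_length": sum(len(w) for w in words) / len(words),
        "mean_sentence_length": len(words) / len(sentences),  # words per sentence
        "syllable_distribution": Counter(syllables),  # syllable count -> number of words
    }


if __name__ == "__main__":
    sample = "The cat sat. A remarkable animal appeared!"
    print(complexity_features(sample))
```

A dictionary-based syllable counter or a proper tokenizer would likely be more accurate; the heuristic is just the simplest thing to get a distribution to look at.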

Does anyone have NLP experience with this type of analysis? Here is my current Kaggle notebook for reference.

Thanks!
Birch
