How to handle an imbalanced NLP data set

I am working on a data-set with around 2000 records.

Around 80% of the records have their own labels.

There are around 200 categories. Some categories have more than 20 records, whereas others have only TWO…

Since this is a text data-set, I cannot up-sample the minority categories with the kind of techniques I could use for images.

So what can I do about it?

Hey @franva, if you found any helpful solutions or tutorials, could you please share them with me?
I’m facing the same problem on a multi-label text classification task with 15 classes: 40% of the records belong to class 1 and the remaining 60% are spread across the other 14 classes.
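
For label distributions like the ones described in this thread, one common starting point that needs no up-sampling is to weight each class inversely to its frequency when training. Below is a minimal sketch, not a definitive solution. It assumes one label per record (a multi-label setup would need per-label counts instead), and the function name `balanced_class_weights` and the toy category names are made up for illustration. The formula, n_samples / (n_classes * count), is the same heuristic scikit-learn uses for `class_weight="balanced"`.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {label: weight} with weight = n_samples / (n_classes * count).

    Rare classes get weights above 1, frequent classes get weights below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical toy labels: one frequent category and two rare ones.
labels = ["billing"] * 20 + ["refund"] * 2 + ["legal"] * 2
weights = balanced_class_weights(labels)

for label, weight in sorted(weights.items()):
    print(f"{label}: {weight:.2f}")
```

The resulting dictionary could then be passed to whatever classifier or loss function accepts per-class weights, so that mistakes on the two-record categories cost more than mistakes on the large ones.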