I have been trying to build a multi-label text classifier to determine the tone of a given email. However, most of the labelled samples have a neutral tone, and that is precisely the problem I am facing. Is there any way to address this class imbalance? Thanks.
Any progress on this? I am also looking for a solution for text classification where the labelled dataset is biased towards one label.
Nope. The closest I could come was to use imblearn (imbalanced-learn), but it does not yield good results.
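For anyone who wants to see what the resampling idea looks like in its simplest form, here is a rough sketch of random oversampling, which is one of the strategies imblearn provides. This is a hand-rolled stand-in written with the standard library only, not imblearn's own API. The function name `random_oversample` and the toy emails are made up for illustration, and it assumes each sample carries a single label, so a true multi-label setup would need more care.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class samples until every class is as large as the biggest one.

    Hypothetical helper for illustration; assumes one label per sample.
    """
    rng = random.Random(seed)

    # Group the texts by their label.
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Draw extra copies (with replacement) from the existing samples.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)

    # Shuffle so that classes are not stored in contiguous runs.
    order = list(range(len(out_texts)))
    rng.shuffle(order)
    return [out_texts[i] for i in order], [out_labels[i] for i in order]


# Toy data, invented for this example.
texts = ["ok", "noted", "see attached", "fine", "this is unacceptable", "thanks so much!"]
labels = ["neutral", "neutral", "neutral", "neutral", "angry", "happy"]

new_texts, new_labels = random_oversample(texts, labels)
print(Counter(new_labels))
```

If you try this, resample only the training split. Duplicating samples before the train/test split leaks copies into the test set and makes the scores look better than they are. Also note that oversampling only repeats existing minority examples, which may be one reason it does not help much when the minority classes are very small.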
I have been facing the same issue with an imbalanced dataset. In particular, I was trying to predict the rating (label) from Amazon reviews. The classifier was really bad for the rating groups with few reviews :(
No idea how this can be fixed!
I have heard of a loss function called focal loss. Maybe that would be of some help.
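In case it helps, here is a minimal sketch of what focal loss computes, written in plain Python so it runs without any deep learning framework. The idea is to scale the usual cross-entropy term by (1 - p_t)^gamma, where p_t is the probability the model gives to the correct answer, so that easy, confidently classified samples (for example the many neutral emails) contribute very little to the loss. The function name `binary_focal_loss` and the default `gamma=2.0` are my own choices for illustration, and the binary form assumes one sigmoid output per label, as is common in multi-label setups.

```python
import math


def binary_focal_loss(probs, targets, gamma=2.0, alpha=None, eps=1e-12):
    """Mean binary focal loss (hypothetical helper for illustration).

    probs   : predicted probability of the positive class for each item
    targets : 0 or 1 for each item
    gamma   : focusing parameter; gamma=0 reduces to plain cross-entropy
    alpha   : optional weight for positives (negatives get 1 - alpha)
    """
    total = 0.0
    for p, y in zip(probs, targets):
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        # Clamp to avoid log(0).
        p_t = min(max(p_t, eps), 1.0)
        # Optional class-balancing weight.
        weight = 1.0 if alpha is None else (alpha if y == 1 else 1.0 - alpha)
        # Cross-entropy term, down-weighted for well-classified samples.
        total += -weight * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# An easy positive (p = 0.95) versus a hard positive (p = 0.10).
easy = binary_focal_loss([0.95], [1])
hard = binary_focal_loss([0.10], [1])
ce_easy = binary_focal_loss([0.95], [1], gamma=0.0)  # plain cross-entropy

print("focal, easy sample:", easy)
print("focal, hard sample:", hard)
print("cross-entropy, easy sample:", ce_easy)
```

The easy sample's loss shrinks a lot compared with plain cross-entropy, while the hard sample keeps most of its loss, so training is pushed towards the samples the model gets wrong. In a real model you would write the same formula with your framework's tensor operations so that gradients flow through it; `gamma` and `alpha` are hyperparameters that need tuning, and I cannot promise it will fix the imbalance on its own.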