In the course examples of sentiment classification, it was always binary classification (positive vs. negative), which usually gave quite good results of 90% accuracy or higher.
I wonder how much the accuracy decreases if we need to classify texts into multiple classes, say 5 or more, instead of just 2.
Are there any such examples using ULMFiT transfer learning?
Another question: what is the order-of-magnitude estimate for the minimal number of examples needed to train a classifier? The movie sentiment dataset uses 50K examples (training and testing combined); can we succeed with much less?
I did a news bias detector that involved 11 classes and got about 93% accuracy. It really depends on the dataset, and on what you want to do with it.
If I remember correctly, my dataset was about 5k rows, limited by how much I could load on the GPU/CPU at the time.
Thanks for the reply.
Do you mean that you succeeded in classifying 11 different types of news with 93% accuracy?
I think that is a great achievement, way beyond the current state of the art.
What do you mean by 5k rows? Did you have 5,000 annotated news articles across all 11 classes?
Thank you! Yes, it was a toy problem I built in 48 hours, so there is plenty of room for improvement. I had the raw text of 5,000 news articles, each with one label.
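For anyone following along: going from 2 classes to many classes doesn't change the training code much, since the classifier head just gets more output categories. This isn't the ULMFiT model discussed above, only a stdlib toy sketch of a multinomial naive Bayes baseline over a made-up 3-class dataset (labels and texts are invented for illustration), to show how multi-class text classification works mechanically:

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """Train a multinomial naive Bayes model from (text, label) pairs."""
    class_counts = Counter()            # documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the class with the highest smoothed log-probability."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Laplace-smoothed log likelihood
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Tiny invented 3-class dataset, purely illustrative.
examples = [
    ("the team won the match", "sports"),
    ("great goal in the final game", "sports"),
    ("new phone chip released", "tech"),
    ("the laptop has a fast processor", "tech"),
    ("the senate passed the bill", "politics"),
    ("election results announced today", "politics"),
]
model = train_nb(examples)
print(predict_nb(model, "the processor in this phone is fast"))  # → tech
```

Nothing in the loop over classes cares whether there are 2 labels or 11; a pretrained language model like ULMFiT mainly helps because its encoder features make each class easier to separate with few labeled examples.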
Interesting! A different dataset than I wound up using, but cool!