In my particular case, I need to build one or more classifiers for sentiment analysis to determine whether a given piece of text is positive or negative, a threat, a suggestion, a complaint, and/or nonsense.
I’m thinking about building a separate classifier for each (e.g., one for determining whether the text is positive or negative, one for determining whether it is a threat, etc.), since each piece of copy can have multiple characteristics.
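To make the "one classifier per characteristic" idea concrete, here is a rough sketch of how the training targets could be laid out. The label names, the helper functions, and the example texts are all made up for illustration, and I'm assuming "positive" is encoded as 1 for positive and 0 for negative.

```python
# Hypothetical label set based on the characteristics listed above.
# Assumption: "positive" = 1 means positive sentiment, 0 means negative.
LABELS = ["positive", "threat", "suggestion", "complaint", "nonsense"]


def to_multi_hot(tags):
    """Turn a set of tags into one 0/1 target per label, in LABELS order."""
    unknown = set(tags) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in tags else 0 for label in LABELS]


def per_label_targets(examples):
    """Split multi-label examples into one binary dataset per label.

    examples: list of (text, set_of_tags) pairs
    returns:  dict mapping each label to a list of (text, 0 or 1) pairs
    """
    return {
        label: [(text, int(label in tags)) for text, tags in examples]
        for label in LABELS
    }


# Made-up examples: a single text can carry several tags at once.
examples = [
    ("Love the app, but please add dark mode.", {"positive", "suggestion"}),
    ("This keeps crashing and support ignores me.", {"complaint"}),
]

multi_hot = [to_multi_hot(tags) for _, tags in examples]
binary_datasets = per_label_targets(examples)
```

Each entry of `binary_datasets` would feed one separate binary classifier, while `multi_hot` is the same information packed into a single target vector per text, in case one model with several outputs turns out to be the better fit.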
My questions are twofold:
- Specifically, should I attempt to accomplish this using character embeddings or word embeddings?
- Generally, what kind of NLP problems are best suited to looking at characters one by one vs. looking at words, and vice versa?
Based on what I infer from the course, I’m inclined to believe that word embeddings are more appropriate to sentiment analysis while character embeddings are more appropriate for predicting things like the next character or generating a bunch of similarly worded text based on a sample.
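For anyone unsure what the two options look like in practice, here is a minimal sketch of the input an embedding layer would see in each case. The helper names (`build_vocab`, `encode`) and the sample sentence are my own, not from the course, and a real pipeline would use a proper tokenizer rather than `str.split`.

```python
def build_vocab(tokens):
    """Map each distinct token to an integer id; id 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Replace each token with its id, falling back to the <unk> id."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokens]


text = "this is not good"

# Word level: one id per whitespace-separated word.
word_tokens = text.split()
word_vocab = build_vocab(word_tokens)
word_ids = encode(word_tokens, word_vocab)

# Character level: one id per character, spaces included.
char_tokens = list(text)
char_vocab = build_vocab(char_tokens)
char_ids = encode(char_tokens, char_vocab)

print(len(word_ids), len(word_vocab))  # short sequence
print(len(char_ids), len(char_vocab))  # much longer sequence for the same text
```

The same sentence becomes a short sequence of word ids or a much longer sequence of character ids. A word that never appeared in training collapses to `<unk>` at the word level, whereas at the character level only the characters that were never seen do.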
If anyone has more specific NLP resources they could share, that would be great. Most of my day-to-day work seems to be going in this direction, and I’d really like to understand which architectures work best in which scenarios.