This is the first benchmark for Thai text classification. I will try to find more in the literature. As far as I know, the previous state of the art was a randomly initialized LSTM at 0.58, and transfer learning is at 0.61 (micro-F1).
The task was to classify restaurant reviews by star rating, from 1 to 5 stars. I treated it as sentiment analysis.
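For anyone comparing numbers, here is a minimal sketch of how micro-F1 can be computed for a 5-class star-rating task like this one. The function name `micro_f1` and the toy labels are made up for illustration and are not the actual evaluation code or data. It pools true positives, false positives, and false negatives over all classes before computing precision and recall.

```python
def micro_f1(y_true, y_pred, labels):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1  # predicted this class and it was correct
            elif p == label and t != label:
                fp += 1  # predicted this class but the truth was another
            elif p != label and t == label:
                fn += 1  # missed an example of this class
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical labels, not real review data)
stars = [1, 2, 3, 4, 5]
y_true = [1, 2, 3, 4, 5, 5]
y_pred = [1, 2, 3, 3, 5, 4]
score = micro_f1(y_true, y_pred, stars)
print(round(score, 4))
```

Note that when every review gets exactly one label, each wrong prediction counts as one false positive and one false negative, so micro-F1 comes out equal to plain accuracy (4 correct out of 6 in the toy example).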
@piotr.czapla I’m trying to incorporate my work into ulmfit-multilingual, but I’m not sure what to do about tokenization and text normalization, since Thai needs a different method than the Moses tokenizer currently used. Do you have any ideas?