ULMFiT - Kannada

(Gaurav) #1

Starting this thread to share progress on the Kannada language model (LM) and the classification results. @piotr.czapla @Moody

Repository: NLP for Kannada

Dataset

Results

Language Model

Evaluated on a 20% validation set:

  • Perplexity of language model: ~70
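For readers comparing numbers: perplexity is conventionally the exponential of the mean per-token cross-entropy loss (in nats). The post does not say how the value was computed, so the snippet below is only a minimal sketch of that standard definition; the function name and inputs are made up for illustration.

```python
import math


def perplexity(token_nll):
    """Perplexity from per-token negative log-likelihoods (in nats).

    `token_nll` is a hypothetical list of per-token losses; the post does not
    describe the actual evaluation code.
    """
    if not token_nll:
        raise ValueError("need at least one token loss")
    mean_loss = sum(token_nll) / len(token_nll)
    return math.exp(mean_loss)


# Under this definition, a perplexity of about 70 corresponds to a mean
# validation loss of ln(70) nats per token.
loss_for_ppl_70 = math.log(70)
```

In other words, if the validation loss is reported in nats, exponentiating it gives the perplexity directly.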

Classifier

  • Accuracy of classification model: ~94%
  • Kappa score of classification model: ~0.90
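As a reference for the two classifier metrics, here is a minimal pure-Python sketch of accuracy and Cohen's kappa (agreement corrected for chance, with a maximum of 1). It assumes the kappa reported above is Cohen's kappa, which the post does not state, and the labels in the example are invented, not taken from the Kannada dataset.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected by chance, from the marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is 1
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels, for illustration only.
y_true = [0, 0, 1, 1]
y_pred = [0, 0, 1, 0]
print(accuracy(y_true, y_pred), cohen_kappa(y_true, y_pred))
```

Kappa is worth reporting next to accuracy because it discounts the agreement a classifier would get by chance on imbalanced classes.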

Pretrained Language Model

Download pretrained Language Model from here

Classifier

Download classifier from here

Tokenizer

The tokenizer was trained with Google’s SentencePiece.

Download the trained model and vocabulary from here
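For anyone who wants to train a similar tokenizer or load the downloaded one, here is a rough sketch using the `sentencepiece` Python package. The corpus path, model prefix, and vocabulary size are placeholders I made up; the post does not state the settings that were actually used.

```python
def train_tokenizer(corpus_path, model_prefix="kannada_sp", vocab_size=8000):
    """Train a SentencePiece model on a plain-text corpus and load it.

    All argument values are hypothetical; adjust them to your data.
    """
    import sentencepiece as spm  # third-party: pip install sentencepiece

    # Writes <model_prefix>.model and <model_prefix>.vocab to the working dir.
    spm.SentencePieceTrainer.train(
        input=corpus_path,
        model_prefix=model_prefix,
        vocab_size=vocab_size,
    )
    return spm.SentencePieceProcessor(model_file=f"{model_prefix}.model")


def tokenize(processor, text):
    """Split text into subword pieces with a loaded SentencePiece processor."""
    return processor.encode(text, out_type=str)
```

To use an already trained model instead of training one, construct `spm.SentencePieceProcessor(model_file=...)` with the path to the downloaded `.model` file and pass it to `tokenize`.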

