Hi.
I pretrained a language model for Japanese (including SentencePiece tokenization). Thank you, @piotr.czapla, for your code and kind direction.
Details are here.
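In case it helps anyone reproduce the tokenization step, here is a minimal sketch of training and applying a SentencePiece model on raw Japanese text. The input file `wiki_ja.txt`, the vocab size, and the other settings are placeholders, not my actual configuration:

```python
import sentencepiece as spm

# Train a subword model on raw Japanese text. The input file, prefix,
# and hyperparameters below are placeholders, not my actual settings.
spm.SentencePieceTrainer.train(
    input="wiki_ja.txt",        # hypothetical corpus file
    model_prefix="ja_wiki",     # writes ja_wiki.model / ja_wiki.vocab
    vocab_size=32000,
    character_coverage=0.9995,  # high coverage is typical for Japanese/CJK
    model_type="unigram",
)

# Tokenize a sentence into subword pieces with the trained model.
sp = spm.SentencePieceProcessor(model_file="ja_wiki.model")
print(sp.encode("言語モデルを事前学習しました。", out_type=str))
```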
I used the pretrained model for classification of disease-related tweets (the MedWeb dataset) and achieved micro-F1 = 0.89, which is 0.03 points below SOTA.
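For reference, micro-F1 pools true positives, false positives, and false negatives across all labels before computing F1, which suits a multi-label setup like MedWeb's. A toy sketch with scikit-learn (the arrays are made up purely for illustration):

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy multi-label example: rows are tweets, columns are disease labels.
# The values are made up purely to illustrate the metric.
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0]])

# Micro-averaging pools TP/FP/FN over all labels before computing F1.
print(f1_score(y_true, y_pred, average="micro"))  # 0.75 on this toy data
```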
I’ll post updates when our repo is ready.