I created and published a German topic classification dataset based on ten thousand German news articles categorized into nine classes. I thought this might be interesting for someone looking here.
I trained a German LM, fine-tuned it, and built a classifier on top, which reaches 89% test accuracy. Additionally, I compared the low-shot learning part of the ULMFiT paper against fastText, a linear SVM, and a TensorFlow NN. I’ll post the results here in the following weeks.
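
For anyone who wants to try a similar low-shot comparison, one way to build the small training sets is to draw a fixed number of labeled articles per class. This is only a minimal sketch and not necessarily how the experiments above were set up; the function name `sample_low_shot`, the `(text, label)` pair format, and the toy data are assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_low_shot(examples, k, seed=0):
    """Return up to k examples per class from a list of (text, label) pairs.

    Hypothetical helper: the name and the data format are assumptions.
    """
    # Group the examples by their class label.
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    subset = []
    for label in sorted(by_label):
        items = by_label[label]
        # Draw k items without replacement, or all items if the class is small.
        subset.extend(rng.sample(items, min(k, len(items))))

    rng.shuffle(subset)  # avoid feeding the classes in sorted blocks
    return subset


# Toy stand-in for the real dataset: 30 fake articles in 3 classes.
toy = [(f"article {i}", f"class_{i % 3}") for i in range(30)]
subset = sample_low_shot(toy, k=5)
```

Training each model on subsets drawn with increasing `k` and evaluating on the same held-out test set gives one accuracy curve per model that can be compared directly.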