@jeremy said that there have recently been some papers using pre-trained language models for classification. Does anyone know which papers those are? Can someone post links to them?
I’m guessing the main paper he’s referring to is: https://arxiv.org/pdf/1710.02076v1.pdf
I also saw it recently discussed here: https://arxiv.org/pdf/1711.05732v1.pdf
Most of the recent embedding papers now include some examination of a few NLP tasks to compare their performance.
Here’s another fun paper, where one of the tasks they evaluate on is text classification.
Here’s another, although based on Jeremy’s comment in another thread I’m guessing I might have missed the mark: these papers relate to embeddings, whereas he’s training the full network and using that as the input.
In the notebook I linked to the main paper that uses a full pretrained model: https://arxiv.org/pdf/1708.00107.pdf. There are some more thoughts and links here: http://ruder.io/transfer-learning/index.html
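To make the distinction above concrete, here’s a minimal PyTorch sketch of the idea of reusing a full pretrained network (rather than just embeddings) as the input to a classifier. All names are placeholders, and the “pretrained” weights are random stand-ins; in practice you’d load a checkpoint from a language model trained on a large corpus, as in the papers linked above:

```python
import torch
import torch.nn as nn

VOCAB_SIZE, EMB_DIM, HID_DIM, N_CLASSES = 1000, 32, 64, 2

class LMEncoder(nn.Module):
    """Embedding + LSTM, standing in for a pretrained language model."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.lstm = nn.LSTM(EMB_DIM, HID_DIM, batch_first=True)

    def forward(self, tokens):
        out, _ = self.lstm(self.emb(tokens))
        return out[:, -1]  # last hidden state as the sequence summary

class Classifier(nn.Module):
    """Frozen pretrained encoder + small trainable classification head."""
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # reuse the LM's features, don't retrain them
        self.head = nn.Linear(HID_DIM, N_CLASSES)

    def forward(self, tokens):
        return self.head(self.encoder(tokens))

encoder = LMEncoder()  # pretend this was loaded from a pretrained checkpoint
model = Classifier(encoder)
batch = torch.randint(0, VOCAB_SIZE, (4, 10))  # 4 sequences of 10 token ids
logits = model(batch)
print(logits.shape)  # torch.Size([4, 2])
```

The contrast with the embedding papers is that here the whole encoder (not just the embedding layer) carries over from pretraining; only the linear head is trained on the classification task.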