Papers on using pre-trained language models to do classification

@jeremy mentioned that there are some recent papers on using pre-trained language models for classification. Does anyone know which papers those are? Could someone post links to them?


I’m guessing the main paper he’s referring to is: https://arxiv.org/pdf/1710.02076v1.pdf

I also saw it recently discussed here: https://arxiv.org/pdf/1711.05732v1.pdf

Most recent embedding papers now include an evaluation on a few downstream NLP tasks to compare their performance.


Here’s another fun paper, where one of the tasks they evaluate on is text classification.


Here’s another, although based on Jeremy’s comment in another thread I’m guessing I might have missed the mark: these papers relate to embeddings, whereas he’s training a full network and using that as the input.

I linked to the main paper that uses a full pretrained model in the notebook: https://arxiv.org/pdf/1708.00107.pdf. There are some more thoughts and links here: http://ruder.io/transfer-learning/index.html
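To make the distinction concrete, here is a minimal PyTorch sketch of the general pattern being discussed: reuse a pretrained language-model encoder as the input to a small classification head, rather than feeding static word embeddings into a model trained from scratch. The encoder class, its hidden size, and the pooling choice below are hypothetical placeholders for illustration, not taken from any of the papers linked above.

```python
# Minimal sketch: pretrained LM encoder + classification head.
# The encoder is assumed to return per-token hidden states of shape
# (batch, seq_len, hidden_size); everything here is a placeholder.
import torch
import torch.nn as nn


class LMClassifier(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_size: int, n_classes: int):
        super().__init__()
        self.encoder = encoder                     # pretrained language model (frozen or fine-tuned)
        self.head = nn.Linear(hidden_size, n_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        hidden = self.encoder(tokens)              # (batch, seq_len, hidden_size)
        pooled = hidden.mean(dim=1)                # mean-pool over the sequence
        return self.head(pooled)                   # (batch, n_classes) logits


# Usage: wrap whatever pretrained encoder you have, then fine-tune on the
# labelled classification data with a standard cross-entropy loss, e.g.:
#   model = LMClassifier(my_pretrained_lm, hidden_size=400, n_classes=2)
#   logits = model(batch_of_token_ids)
#   loss = nn.functional.cross_entropy(logits, labels)
```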
