To validate our finetuning script, I trained models and compared the results with Table 8 of the ELECTRA paper. The results are even slightly better than the paper's!
Model | CoLA | SST | MRPC | STS | QQP | MNLI | QNLI | RTE | Avg. |
---|---|---|---|---|---|---|---|---|---|
ELECTRA-Small | 54.6 | 89.1 | 83.7 | 80.3 | 88.0 | 79.7 | 87.7 | 60.8 | 78.0 |
ELECTRA-Small (finetuned with fastai) | 52.8 | 89.8 | 84.5 | 83.6 | 88.7 | 80.4 | 88.9 | 65.2 | 79.2 |
ELECTRA-Small++ | 55.6 | 91.1 | 84.9 | 84.6 | 88.0 | 81.6 | 88.3 | 63.6 | 79.7 |
- Results are on the test set.
- No ensembling and no task-specific tricks; we only pick the best of 10 trained models.
- It is actually unclear to me whether `electra-small-discriminator` on the Hugging Face Hub is ELECTRA-Small or ELECTRA-Small++ (note that the ELECTRA-Small entry on the GLUE leaderboard is actually ELECTRA-Small++).
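The "best of 10 trained models" selection above can be sketched as a small helper. Here `train_fn` and `fake_train` are hypothetical stand-ins for a full finetuning run that returns a model and its dev-set score; this is a minimal sketch of the selection logic, not the actual training code.

```python
import random


def pick_best_run(train_fn, n_runs=10, seeds=None):
    """Run `train_fn` with `n_runs` different seeds and keep the single
    run with the highest dev-set score (no ensembling, no task tricks).

    `train_fn(seed)` is a hypothetical stand-in for one full finetuning
    run; it must return a `(model, dev_score)` pair."""
    seeds = seeds if seeds is not None else range(n_runs)
    best_model, best_score = None, float("-inf")
    for seed in seeds:
        model, score = train_fn(seed)
        if score > best_score:
            best_model, best_score = model, score
    return best_model, best_score


# Toy stand-in: pretend the dev score varies with the random seed.
def fake_train(seed):
    random.seed(seed)
    return f"model_{seed}", 80 + random.random() * 5


model, score = pick_best_run(fake_train, n_runs=10)
```

The final test-set numbers in the table come from evaluating only the run selected this way, which is why no ensembling is involved.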
"Pretrain MLM and finetune on GLUE with fastai"

Previous posts:
- TextDataLoader: as fast as or faster, and also adds a sliding window, caching, and a progress bar
- Novel Hugging Face/nlp integration: train on and `show_batch` hf/nlp datasets
Also, follow me on Twitter (Richard Wang) for updates on this series.
Things on the way:
- Use one_cycle and fp16 to reproduce the results
- Pretrain ELECTRA-Small from scratch
- Ensembling and WNLI tricks (maybe)