Parameter-efficient Transfer Learning for NLP

I stumbled upon this paper recently. Any thoughts on how effective the approach is?

Parameter-efficient Transfer Learning for NLP
