MultiFiT English pre-trained model?

Hi all, have any of you already pretrained the MultiFiT model on English Wikipedia data, using the method proposed in the paper by Julian Eisenschlos, Sebastian Ruder, Piotr Czapla, Marcin Kardas, Sylvain, and Jeremy of the fast.ai community?

(MultiFiT is an improved, more efficient version of ULMFiT, with subword tokenization, QRNNs, the 1cycle policy, label smoothing, etc. The official repo seems to offer pretrained models only for languages other than English.)
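For anyone unfamiliar with label smoothing, one of the ingredients listed above, here is a minimal sketch in plain Python. The function name and the `eps=0.1` default are illustrative assumptions, not taken from the MultiFiT code; it only shows the usual formulation where a small share of the target probability is spread uniformly over all classes.

```python
import math

def label_smoothing_cross_entropy(logits, target, eps=0.1):
    """Cross-entropy against a smoothed target distribution.

    The true class gets weight (1 - eps) + eps / K and every other
    class gets eps / K, where K is the number of classes.
    (Hypothetical helper for illustration only.)
    """
    k = len(logits)
    # Numerically stable log-softmax over the raw scores.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    # Smoothed targets: mass eps is spread uniformly over all K classes.
    smoothed = [eps / k + (1.0 - eps if i == target else 0.0) for i in range(k)]
    return -sum(q * lp for q, lp in zip(smoothed, log_probs))
```

With `eps=0` this reduces to ordinary cross-entropy; with `eps>0` a very confident correct prediction is penalized slightly, which is the regularizing effect.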

If you have pretrained it and you could share the weights, it would be much appreciated. Thanks in advance!

I’m planning to run some fine-tuning experiments with it under different resource constraints, on English classification datasets, comparing against other models (e.g. ULMFiT, BERT). If no one has pretrained this model on English Wikipedia yet, I’ll try to do so, although I currently have limited hardware access.


I’m interested in following the discussion, but to my knowledge MultiFiT is meant for multilingual tasks (languages other than English), isn’t it?


I think the paper focuses on multilingual applications, but the method is not limited to them. It demonstrates a monolingual training approach like ULMFiT’s: the language model is pretrained on a (non-English) Wikipedia, then fine-tuned as a language model on the text of a classification dataset in the same language, and finally fine-tuned with a classification head on that same dataset (the paper also has a cross-lingual approach, though). So I think the monolingual approach is equally relevant for English and any other language.

How have you got on with this, @vharg? I’ve just finished evaluating ULMFiT on tweets and found it did not perform as well as I had hoped. I believe the MultiFiT architecture would be better suited to the short, idiosyncratic text of tweets, so I am very interested if anyone has created an English pre-trained version!


Can someone share a working Colab version of MultiFiT? Mine is throwing many Python errors.

exp = multifit.from_pretrained(f'{lang}_multifit_paper_version')