Integrating pretrained huggingface transformers for language modelling

After having problems getting decent results with the default fastai transformer for language modelling, I tried to integrate a pretrained transformer from Hugging Face into fastai, following this tutorial.
Since I am looking at language generation, I used the pretrained GPT2LMHeadModel.
I initialized a LanguageLearner with this model and, without any further training, tried to generate text with it.
However, this produces complete nonsense. I suspect this is due to tokenization.

These are the relevant snippets from my code:

gpt2_tok = GPT2Tokenizer.from_pretrained('gpt2')
fastai_tokenizer = Tokenizer(tok_func=FastAiGPT2Tokenizer(gpt2_tok), pre_rules=[], post_rules=[])
gpt2_transformer_vocab = TransformersVocab(tokenizer = gpt2_tok)

numericalize_processor = NumericalizeProcessor(vocab=gpt2_transformer_vocab)
tokenize_processor = TokenizeProcessor(tokenizer=fastai_tokenizer, include_bos=False, include_eos=False)
transformer_processor = [OpenFileProcessor(), tokenize_processor, numericalize_processor]

Since a language learner needs a databunch to be initialized, I passed it the following one, which is created from a single file containing only a few lines of text.

data_lm = (TextList.from_folder(samples_path, processor=transformer_processor)

My prediction using

learner.predict("It had been a beatiful day and", 40, temperature=0.75)

returns:

'It had been a beatiful day and Disclaimer Ġof Ġthe Ġthe Ġfinal Ġfirst . 23 " ĠGL . Ċ Ċ , Ġand Ġgoing Ġby Ġdid Ġto Ġanimal , Ġa Ġsix Ġteachings Ġby , Ġand Ġmore Ġof Ġthe Ġsuper Ġtracking Ġin Ġthe Ġtimes Ġof Ġhis Ġto Ġtake Ġfor'

Has anyone got any ideas on this? I suspect the Ċ is a special token which is not translated back for some reason. But even ignoring Ċ, the text is nonsense, although the model is meant to be pretrained.

P.S.: I have posted a different question about using fastai's default Transformer here.
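
A note on the Ġ and Ċ characters in the output above: as far as I can tell they are not special tokens. GPT-2 uses byte-level BPE, which renders every byte as a printable character, so a space shows up as Ġ and a newline as Ċ. Below is a minimal sketch of how raw token strings can be turned back into text. It rebuilds the byte table the same way OpenAI's GPT-2 encoder does; decode_tokens is a hypothetical helper name, not part of fastai or transformers. With the real tokenizer, gpt2_tok.convert_tokens_to_string(tokens) should give the same result. This only explains the odd characters; it does not explain why the generated words themselves are nonsense.

```python
def bytes_to_unicode():
    """Rebuild GPT-2's byte -> printable-character table."""
    # Bytes that are already printable keep their own character:
    # '!'..'~' (33-126), U+00A1..U+00AC (161-172), U+00AE..U+00FF (174-255).
    keep = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    byte_values = keep[:]
    code_points = keep[:]
    shift = 0
    for b in range(256):
        if b not in keep:
            # Every other byte is moved to code point 256 and above,
            # e.g. newline (10) -> U+010A 'Ċ', space (32) -> U+0120 'Ġ'.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return dict(zip(byte_values, map(chr, code_points)))


# Inverse table: printable character -> original byte value.
BYTE_DECODER = {char: byte for byte, char in bytes_to_unicode().items()}


def decode_tokens(tokens):
    """Join raw GPT-2 token strings and map them back to readable text.

    Hypothetical helper for illustration. It expects token strings only;
    an ordinary space character is not in the table and raises KeyError.
    """
    text = "".join(tokens)
    raw = bytes(BYTE_DECODER[ch] for ch in text)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = ["Ġof", "Ġthe", "Ġfinal", ".", "Ċ", "Ċ", ",", "Ġand"]
    print(repr(decode_tokens(sample)))  # ' of the final.\n\n, and'
```

If the vocab's textify simply joins token strings with spaces, that would explain why the markers survive into the predicted text, but that is an assumption about code not shown here.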

Don’t know if you saw it, but Sylvain released a GPT2 tutorial for fastai v2 here:

Does anyone have an example of a simple multi-category classifier building on Sylvain’s GPT2 tutorial?

I think Jeremy is working on something, but in the meantime you can have a look at for some fastai + Transformers demos