OpenAI GPT-2 larger model released

Previously they released a smaller model. This is now a larger version, but still not the full 1.5B-parameter model. I wish my GPU had more memory :) Should be a great starting point for NLP projects!


Anyone played with this yet?

I fine-tuned it using a collection of @dril tweets. Pretty good results. Choosing randomly from my sample (I’ve checked and didn’t find any duplicates, but I could be misusing Twitter search):

  • the deer stalker is dead
  • linking my Facebook.
  • setting up the real man cave , for the real men, in my new post. man cave

Posted the code here, along with some tips for tuning the model and a link to the colab notebook I used.


There is already a PyTorch port in Hugging Face’s library. I played with it a little (no fine-tuning, just generating text for the prompts from the original blog post), without great results.

I’m currently working on reproducing the quality of the blog post’s samples (although it may be impossible, since this version is large but not huge).
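
One thing that may matter for sample quality is the decoding strategy. If I remember right, OpenAI's published samples used top-k truncation with k = 40, so plain sampling over the full vocabulary could look noticeably worse. Here is a minimal sketch of top-k sampling in plain Python. The function name `top_k_sample` and its parameters are my own, not part of any library, and it assumes you already have one list of logits (one score per vocabulary token) for the next position:

```python
import math
import random


def top_k_sample(logits, k=40, temperature=1.0, rng=random):
    """Sample one token index, considering only the k highest-scoring tokens.

    Hypothetical helper for illustration; `logits` is a non-empty list of
    floats and `temperature` is assumed to be positive.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Indices of the k largest logits (all of them if the vocabulary is smaller than k).
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Unnormalized softmax over the kept logits, shifted by the max for stability.
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # random.choices normalizes the weights before drawing.
    return rng.choices(top, weights=weights, k=1)[0]
```

In a generation loop you would call this once per step on the model's logits for the last position and append the returned index to the input. Lower temperatures make the output more conservative, and k = 1 reduces to greedy decoding.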

