OpenAI GPT-2 larger model released

Previously they released a smaller version of the model. This is a larger version, but still not the full 1.5B-parameter model. I wish my GPU had more memory :slight_smile: It should be a great starting point for NLP projects!


Anyone played with this yet?

I fine-tuned it on a collection of @dril tweets, with pretty good results. Here's a random selection from my samples (I checked for duplicates of real tweets and didn't find any, but I could be misusing Twitter search):

  • the deer stalker is dead
  • linking my Facebook.
  • setting up the real man cave , for the real men, in my new post. man cave

Posted the code here, along with some tips for tuning the model and a link to the colab notebook I used.
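For anyone curious about the general approach: the usual first step when fine-tuning GPT-2 on a tweet archive is to join the examples into one corpus, separated by the model's end-of-text token, before feeding it to the training script. A minimal sketch (the helper name is mine, not from the linked code):

```python
END_OF_TEXT = "<|endoftext|>"  # GPT-2's document-separator token

def build_corpus(tweets):
    """Join tweets into a single fine-tuning corpus, one document per tweet."""
    cleaned = [t.strip() for t in tweets if t.strip()]
    return END_OF_TEXT.join(cleaned)

corpus = build_corpus([
    "the deer stalker is dead",
    "linking my Facebook.",
])
print(corpus)
# → the deer stalker is dead<|endoftext|>linking my Facebook.
```

The separator keeps the model from learning spurious continuations across unrelated tweets.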


There is already a PyTorch port in huggingface's library. I played with it a little: no fine-tuning, just generating text from the prompts in the original post, without great results:

I'm currently working on reproducing the quality of the blog post samples (although it may be impossible, since this version is large but still not the full model).
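One thing that matters a lot for sample quality is the decoding strategy: OpenAI's released samples were drawn with top-k sampling rather than greedy decoding. A minimal numpy sketch of that step (the function name and logits are illustrative, not from the library):

```python
import numpy as np

def top_k_sample(logits, k, rng):
    """Sample a token id from the k highest-scoring logits only."""
    logits = np.asarray(logits, dtype=np.float64)
    top = np.argsort(logits)[-k:]            # indices of the k largest logits
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                     # softmax over the k survivors
    return int(rng.choice(top, p=probs))

rng = np.random.default_rng(0)
token_id = top_k_sample([0.1, 5.0, 4.0, -1.0], k=2, rng=rng)
```

Restricting sampling to the top k tokens cuts off the low-probability tail, which is where a lot of incoherent output comes from.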


It’s finally here!
I’ve also been anticipating this release.

Check it out