@ilovescience highlighted in Discord that Andrej Karpathy recently released a nicely minimalist implementation of GPT, with example notebooks too.
@muellerzr said I should take a look. And so here is a fastai version!
Fastai version of his Play_Char notebook, training it on Shakespeare and then generating new dialogue. Happy to hear any and all feedback!
(It’s not working with mixed precision yet due to a small bug in the model’s forward)
This is very nice. Thank you.
Updated version here, more fastai-like and also working with mixed precision! (Note that I had to use
to_fp16 wasn’t training well at all (maybe due to LayerNorm?))
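
For anyone wondering why half precision can be fragile in general, here is a minimal stdlib-only sketch. It is not a diagnosis of the issue above and does not use fastai or PyTorch at all. It round-trips plain Python floats through IEEE half precision to show two well-known fp16 limits: tiny values flushing to zero (which loss scaling works around) and large values exceeding the fp16 range. The helper `to_half` and the values `small_grad` and `loss_scale` are hypothetical, picked only for illustration.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to IEEE half precision and back (hypothetical helper)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        # struct refuses values outside the fp16 range; treat them as +/- infinity here
        return float("inf") if x > 0 else float("-inf")


small_grad = 1e-8      # hypothetical tiny gradient value
loss_scale = 1024.0    # hypothetical loss-scaling factor

# Stored directly in fp16, the tiny value is lost entirely.
underflowed = to_half(small_grad)

# Scaled up before the fp16 round-trip, it survives and can be unscaled in full precision.
recovered = to_half(small_grad * loss_scale) / loss_scale

# Values above the fp16 maximum (65504) cannot be represented.
overflowed = to_half(70000.0)

print("direct fp16:      ", underflowed)
print("scaled then fp16: ", recovered)
print("70000 in fp16:    ", overflowed)
```

The scaling step is the same basic idea that mixed-precision training relies on to keep small gradients from vanishing; whether that has anything to do with the LayerNorm guess above is an open question.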