@ilovescience highlighted in Discord that Andrej Karpathy recently released a nicely minimalist implementation of GPT, with example notebooks too.

@muellerzr said I should take a look. And so here is a fastai version!

Fastai Version

Fastai version of his Play_Char notebook, training it on Shakespeare and then generating new dialogue. Happy to hear any and all feedback!

(It’s not working with mixed precision yet due to a small bug in the model’s forward.)
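
For anyone who hasn't opened the notebook: a rough sketch of the character-level data prep that this kind of setup relies on, where each training example is a window of characters and the target is the same window shifted by one. This is not copied from the notebook; `build_vocab`, `make_examples` and `block_size` are names made up for illustration, and the short string is only a stand-in for the Shakespeare text.

```python
def build_vocab(text):
    """Map each distinct character to an integer id, and back again."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def make_examples(text, stoi, block_size):
    """Slide a window over the text; the target is the input shifted by one character."""
    ids = [stoi[ch] for ch in text]
    examples = []
    for start in range(len(ids) - block_size):
        chunk = ids[start:start + block_size + 1]  # block_size inputs plus one extra char
        examples.append((chunk[:-1], chunk[1:]))
    return examples


text = "To be, or not to be"  # hypothetical stand-in for the Shakespeare corpus
stoi, itos = build_vocab(text)
examples = make_examples(text, stoi, block_size=8)

# Decode the first (input, target) pair back to characters to inspect it.
x, y = examples[0]
print("".join(itos[i] for i in x), "->", "".join(itos[i] for i in y))
```

The model then learns to predict every next character in the window at once, which is why the target is simply the input moved along by one position.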


This is very nice. Thank you.


Updated version here, more fastai-like and also working with mixed precision! (Note that I had to use to_native_fp16, as to_fp16 wasn’t training well at all, maybe due to LayerNorm?)