Retro transformer model with fast.ai

Hi!

I’m convinced that retrieval-based language models of the type proposed in the DeepMind RETRO paper ([2112.04426] Improving language models by retrieving from trillions of tokens) will quickly win out over “blind” LLMs.

So far I have only found this PyTorch implementation: lucidrains/RETRO-pytorch on GitHub (“Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch”), but I’d love to work on a from-scratch, fastai-based version.

Has anyone here researched or implemented a retrieval-based transformer LM using fastai?
