I ultimately decided to simply follow the strategy presented here for teacher forcing. It seemed more straightforward for my purposes than writing a bunch of custom DataBlock API code to achieve the same result.
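For anyone unfamiliar with the term: teacher forcing means that during training the decoder is fed the ground-truth previous token at each step rather than its own prediction. Below is a minimal plain-Python sketch of how the decoder input and the labels are usually derived from one target sequence. It is not the linked strategy and not fastai API code; the function name `make_teacher_forcing_pair` and the `bos_id`/`eos_id` values are hypothetical and only there for illustration.

```python
from typing import List, Tuple


def make_teacher_forcing_pair(
    target_ids: List[int], bos_id: int, eos_id: int
) -> Tuple[List[int], List[int]]:
    """Build (decoder_input, labels) from one tokenized target sequence.

    Hypothetical helper: names and special-token ids are assumptions.
    """
    # The decoder sees the target shifted right: it starts from BOS and
    # receives the true previous token at every step.
    decoder_input = [bos_id] + target_ids
    # The labels are the same tokens shifted left, ending with EOS, so
    # position i of decoder_input is trained to predict labels[i].
    labels = target_ids + [eos_id]
    return decoder_input, labels


# Example with made-up token ids (1 = BOS, 2 = EOS).
dec_in, labels = make_teacher_forcing_pair([5, 6, 7], bos_id=1, eos_id=2)
print(dec_in)
print(labels)
```

In a dataloader this would mean each item carries the source sequence plus both of these lists, which is the extra 'targets' input the model needs.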
Here’s my initial take on using the API to build seq2seq-friendly datasets/dataloaders. If you have any suggestions on how it could be improved, or on how it could (or should) be modified to include the ‘targets’ needed for teacher forcing, I’m all ears.