Auxiliary training for seq2seq models

I’m implementing something very similar to this (Dynamic Multi-Level Multi-Task Learning for Sentence Simplification), but at the character level.

I was wondering if anyone has already built something along the lines of a fastai learner that interleaves minibatches across several auxiliary tasks. My current structure is a shared encoder with two decoders, but I’ve been interleaving at the epoch level, which leads to more divergence between the two tasks than I’d like.
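For reference, here’s a minimal sketch of what I mean by minibatch-level interleaving in plain PyTorch (not fastai). All names are hypothetical, the decoders are simplified to linear heads rather than full seq2seq decoders, and the data is synthetic; the point is just the training loop, which alternates one optimizer step per task so the shared encoder sees both objectives within every "epoch":

```python
import torch
import torch.nn as nn

# Hypothetical shared encoder; both tasks read its final hidden state.
class SharedEncoder(nn.Module):
    def __init__(self, vocab=50, emb=16, hid=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True)

    def forward(self, x):
        _, h = self.rnn(self.emb(x))
        return h.squeeze(0)  # (batch, hid)

encoder = SharedEncoder()
# Stand-ins for the two task-specific decoders (real ones would be seq2seq).
decoders = {"simplify": nn.Linear(32, 50), "auxiliary": nn.Linear(32, 50)}
params = list(encoder.parameters()) + [
    p for d in decoders.values() for p in d.parameters()
]
opt = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Synthetic batches standing in for each task's dataloader.
def fake_loader(n_batches, seed):
    g = torch.Generator().manual_seed(seed)
    for _ in range(n_batches):
        x = torch.randint(0, 50, (8, 12), generator=g)  # token ids
        y = torch.randint(0, 50, (8,), generator=g)     # toy targets
        yield x, y

losses = []
# Interleave at the minibatch level: one step on task A, then one on task B.
for (xa, ya), (xb, yb) in zip(fake_loader(5, 0), fake_loader(5, 1)):
    for task, (x, y) in [("simplify", (xa, ya)), ("auxiliary", (xb, yb))]:
        opt.zero_grad()
        loss = loss_fn(decoders[task](encoder(x)), y)
        loss.backward()
        opt.step()
        losses.append(loss.item())
```

In fastai terms this would presumably live in a custom callback or a custom training loop, since the stock `Learner` assumes a single dataloader per phase.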