Hi,
I posted basically the same question a while ago and received no replies ("Can I somehow keep a linked copy of the original untokenized input data for the text_classifier_learner").
What I do for now: every time I train, I also run the interpretation right away and match each result to its row of the input data. Then I save the input texts and their interpretations to a database, and later query it by interpretation to retrieve the original text.
It’s a bit of a hack, but it works reasonably well for me as long as short inputs do not contain too many UNK tokens, in which case the lookup might match the wrong input.
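In case it helps anyone, here is a minimal sketch of the lookup part of that workaround. It is not fastai code: it assumes you already have the original texts and the tokenized text shown by the interpretation as two plain Python lists in the same row order. The table name `lookup`, the helper names `build_lookup` and `find_originals`, and the sample strings are all made up for illustration.

```python
import sqlite3


def build_lookup(conn, originals, interpretations):
    """Store each original text next to the tokenized text shown for it."""
    if len(originals) != len(interpretations):
        raise ValueError("originals and interpretations must have the same length")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lookup ("
        "tokenized TEXT NOT NULL, original TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_tokenized ON lookup (tokenized)")
    conn.executemany(
        "INSERT INTO lookup (tokenized, original) VALUES (?, ?)",
        zip(interpretations, originals),
    )
    conn.commit()


def find_originals(conn, tokenized):
    """Return every original text stored under this tokenized text."""
    rows = conn.execute(
        "SELECT original FROM lookup WHERE tokenized = ? ORDER BY rowid",
        (tokenized,),
    ).fetchall()
    return [row[0] for row in rows]


# Made-up sample data: two short inputs collapse to the same tokenized text
# because the differing word became an unknown token.
originals = [
    "Great phone, love it",
    "Great fone, love it",
    "Terrible battery life",
]
interpretations = [
    "xxbos great xxunk , love it",
    "xxbos great xxunk , love it",
    "xxbos terrible battery life",
]

conn = sqlite3.connect(":memory:")  # use a file path to keep it between runs
build_lookup(conn, originals, interpretations)

matches = find_originals(conn, "xxbos great xxunk , love it")
if len(matches) > 1:
    print("Ambiguous match, candidates:", matches)
```

Returning all matches rather than the first one makes the UNK problem visible: when a short input maps to more than one original, you at least see that the lookup is ambiguous instead of silently getting the wrong row.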