NLP and Historical Data

So this might be a little out of left field, but it’s a problem I’ve been thinking about, and I thought others might have ideas. I’m working on a project that attempts to classify historical documents. The documents all claim to be written by the same person, but historians have expressed skepticism about a subset of the corpus. Now, this seems like a straightforward NLP classification problem: I can train a classifier on the documents that were assuredly written by the author, and then test the problematic texts against it. I’m working on trying this with

But here’s the kicker: the reason historians are skeptical of some documents in the corpus has less to do with stylistics (which I assume the language-classifier approach would pick up) than with the fact that some of the content is historically anachronistic. That is, the text refers to events that happened after the death of the person in whose name it was written (it would be as if a document allegedly written by George Washington referenced the Civil War). I don’t think the language model is going to catch that. But is there a way to do it?

I was wondering whether one could set up a classifier based on date (categorizing texts by when they were written), train on that, and see whether it predicts a date of composition that falls outside the author’s lifetime. But I wonder if others have different ideas about how to solve this problem?
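To make the date-classifier idea concrete, here is a minimal sketch using only the Python standard library. It is a toy multinomial naive Bayes over word counts, not a tested method for real historical corpora. The class name `PeriodClassifier`, the four training sentences, and the decade labels are all invented for illustration; the 1732-1799 lifespan is George Washington's, matching the example above.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class PeriodClassifier:
    """Toy multinomial naive Bayes with add-one smoothing (hypothetical helper)."""

    def fit(self, texts, periods):
        self.word_counts = defaultdict(Counter)  # period -> word frequencies
        self.doc_counts = Counter(periods)       # period -> number of documents
        for text, period in zip(texts, periods):
            self.word_counts[period].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence, so they are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for period, counts in self.word_counts.items():
            n_words = sum(counts.values())
            score = math.log(self.doc_counts[period] / total_docs)
            for t in tokens:
                score += math.log((counts[t] + 1) / (n_words + len(self.vocab)))
            scores[period] = score
        return max(scores, key=scores.get)


# Invented training data: each text is labeled with the decade it was written in.
train_texts = [
    "the continental congress met to discuss the stamp act and the militia",
    "general washington led the continental army across the river",
    "the union army fought the confederate states over secession",
    "the telegraph carried news of the railroad and of secession",
]
train_periods = [1770, 1770, 1860, 1860]

clf = PeriodClassifier().fit(train_texts, train_periods)

birth_year, death_year = 1732, 1799  # the claimed author's lifespan
predicted = clf.predict("news of secession reached us by telegraph")
is_suspicious = not (birth_year <= predicted <= death_year)
```

On this toy data the disputed sentence is assigned to the 1860 bucket and flagged, since that falls after 1799. In practice the signal would come from vocabulary and spelling drift across many documents, so the approach depends on having enough securely dated texts from each period to train on.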

You could build a new dataset in which these references are the targets. So if a text contains a reference to the Civil War, the classifier has to detect that “Civil War” is a reference and flag it. You would then run that over the complete text.
But then I am not sure how we would classify those references. Maybe manual work, or a new classifier that I am not aware of.
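One rough way to picture the "detect the reference, then classify it" step is a hand-built lookup table of datable phrases, which is the manual-work option. The sketch below assumes such a table exists. `EVENT_GAZETTEER`, `Anachronism`, and `find_anachronisms` are hypothetical names, and the three entries are only illustrative.

```python
import re
from dataclasses import dataclass


@dataclass
class Anachronism:
    phrase: str         # the gazetteer phrase that matched
    earliest_year: int  # earliest year a text could plausibly mention it
    position: int       # character offset of the match in the text


# Hypothetical, hand-curated table: phrase -> earliest plausible year of mention.
EVENT_GAZETTEER = {
    "stamp act": 1765,
    "civil war": 1861,
    "gettysburg address": 1863,
}


def find_anachronisms(text, death_year, gazetteer=EVENT_GAZETTEER):
    """Return gazetteer phrases in `text` that postdate the author's death."""
    hits = []
    for phrase, year in gazetteer.items():
        if year <= death_year:
            continue  # the author could have known about this, so not anachronistic
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append(Anachronism(phrase, year, match.start()))
    return sorted(hits, key=lambda h: h.position)


doc = "After the Stamp Act we never imagined a Civil War would follow."
hits = find_anachronisms(doc, death_year=1799)
flagged_phrases = [h.phrase for h in hits]
```

Here only "civil war" is flagged, because the Stamp Act predates 1799. A plain lookup cannot tell which civil war is meant (the English Civil War would be a false positive), so a trained named-entity or event-detection model could replace the regex step while the date comparison stays the same.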