How to answer ‘who wrote this?’: stylometric analysis & authorship attribution

Does anyone know whether the fast.ai library can be used for authorship attribution / stylometric analysis, i.e. to ultimately answer: who wrote a given text?

I’ve got a project involving:
- text files with disputed authorship
- text files whose authors we know with 100% certainty, which can be used to train stylistic models

If anyone can say whether this can be done with fast.ai, or has links to a ‘how to’ guide or post (or anything), that would be awesome.

Thanks so much! :+1: :pray:

I’m afraid I don’t have anything in the way of a how-to, as I haven’t approached this problem myself. But I believe this is similar to face identification: there are many possible classes and few examples per class. In that domain, Siamese networks are often used. They pass two inputs through the same encoder and learn whether the pair belongs to the same class. Perhaps you might look into constructing a Siamese architecture based on a fastai language model?
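To make the pairwise idea a bit more concrete, here is a rough, untuned sketch in plain Python. It is not a Siamese network and not fastai code: the shared "encoder" is just a fixed character-trigram frequency profile standing in for a learned encoder (such as one taken from a fine-tuned language model), and the comparison is cosine similarity rather than a trained head. All of the names (`encode`, `cosine`, `same_author_score`, `attribute`) are made up for illustration.

```python
from collections import Counter
from math import sqrt


def encode(text, n=3):
    """Shared 'encoder': relative frequencies of character n-grams.

    Assumption: this fixed profile is only a stand-in for a learned
    encoder; a real Siamese setup would train this part.
    """
    text = " ".join(text.lower().split())  # lowercase, collapse whitespace
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values()) or 1
    return {g: c / total for g, c in grams.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(g, 0.0) for g, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def same_author_score(text_a, text_b):
    """Both texts go through the SAME encoder and are then compared."""
    return cosine(encode(text_a), encode(text_b))


def attribute(disputed, known):
    """Rank candidate authors for a disputed text.

    known: dict mapping author name -> list of texts by that author.
    Returns (author, mean similarity) pairs, best match first.
    """
    scores = {
        author: sum(same_author_score(disputed, t) for t in texts) / len(texts)
        for author, texts in known.items()
        if texts
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

You would call `attribute(disputed_text, {"author_a": [...], "author_b": [...]})` with the known-author files as the reference set. The design point is the same as in the face-identification case: the model only has to judge whether two samples match, so a candidate author can be compared against even with few known texts.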

There was a post a few days ago about implementing Siamese CNNs in fastai. Per Jeremy’s response, it is a matter of creating a custom head. Maybe that is a lead you can follow:

FastAI is too abstracted... how to create custom networks?

Hi,
Did you find anything regarding the project?
I tried fastai’s text classifier learner (text_classifier_learner) after training a language model on my dataset, but it doesn’t perform very well.