Hi, first of all, thanks to this great community. This is actually my first time, and I’m so happy to be part of these fantastic discussions.
I work on a project that mainly involves text generation in the Persian language. I have been experimenting with different architectures, including AWD-LSTM, GPT-based models, and so on. The metrics I've used (as shown in the fastai courses) are loss, perplexity, and accuracy.
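For anyone comparing these metrics across architectures: perplexity and loss are directly related, so they rank models identically. A minimal sketch (assuming the loss is mean token-level cross-entropy in natural log, as fastai reports it):

```python
import math

def perplexity(cross_entropy_loss: float) -> float:
    # Perplexity is the exponential of the mean per-token
    # cross-entropy loss (natural-log base).
    return math.exp(cross_entropy_loss)

# e.g. a validation loss of 3.2 corresponds to a perplexity of about 24.5
print(perplexity(3.2))  # ≈ 24.53
```

Because of this one-to-one relationship, perplexity adds no new information beyond loss; accuracy (next-token top-1) is the only independent signal among the three.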
I also tried fine-tuning the models on downstream tasks, but that can take quite some time.
I wonder: are there other metrics specific to text generation that I could use to evaluate different architectures on the same dataset?