Large Language Model grammar training

This post on the Stable Diffusion topic sparked a random thought… it was off-topic there, so I'm shifting it to a new topic…

I was wondering: SOTA large language models that can carry on near-Turing-level chat are obviously doing a reasonable job of implicitly understanding grammar (verbs/nouns/adjectives/etc.), but has anyone done explicit grammar training as a final step?

That is, after the implicit learning from the existing training process, another training step is added where the model is presented with explicit grammar questions and trained against the correct answers. Compare to what we do with humans: children are smart enough to pick up the local language on their own, but we supplement that in primary school with explicit grammar lessons.
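To make the idea concrete, here's a minimal sketch (my own illustration, not from any existing system) of how schoolbook-style grammar exercises could be reformatted into ordinary prompt/completion pairs for a fine-tuning pass. The example questions and the `to_training_example` helper are hypothetical.

```python
# Hypothetical sketch: turning explicit grammar exercises into
# prompt/completion pairs for a supervised fine-tuning step.
exercises = [
    ("Identify the verb in: 'The cat chased the mouse.'", "chased"),
    ("Choose the correct form: 'She ___ to school every day.' (go/goes)", "goes"),
]

def to_training_example(question, answer):
    """Format one exercise as a prompt/completion pair; during fine-tuning,
    cross-entropy loss would be computed on the completion tokens only."""
    return {"prompt": f"Q: {question}\nA: ", "completion": answer}

examples = [to_training_example(q, a) for q, a in exercises]
print(examples[1]["completion"])  # goes
```

In other words, the "grammar questions as an objective" step wouldn't need a new loss function at all; it's just a curated supervised dataset run through the same next-token objective.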

I googled around a bit, but most of what I found was like this one, Evaluating Syntactic Abilities of Language Models, which makes no mention of improving the model by feeding the results back into it. But if you can measure it, you can train it!!
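For anyone unfamiliar with how those syntactic evaluations typically work: they score "minimal pairs" (two sentences differing only in a grammatical feature, like subject-verb agreement) and check whether the model assigns higher probability to the grammatical one. Here's a toy, self-contained illustration using a smoothed bigram model over a tiny made-up corpus; a real evaluation would score a pretrained LLM instead.

```python
# Toy minimal-pair scoring: does the model prefer the grammatical sentence?
from collections import defaultdict
import math

# Tiny hypothetical corpus; stands in for an LLM's training data.
corpus = ("the keys are on the table . the key is on the table . "
          "the dogs are barking . the dog is barking .")
tokens = corpus.split()

# Count bigrams for an add-one-smoothed bigram language model.
bigrams = defaultdict(lambda: defaultdict(int))
for a, b in zip(tokens, tokens[1:]):
    bigrams[a][b] += 1
vocab = set(tokens)

def logprob(sentence):
    """Sum of add-one-smoothed bigram log-probabilities for a sentence."""
    words = sentence.split()
    total = 0.0
    for a, b in zip(words, words[1:]):
        count = bigrams[a][b] + 1
        denom = sum(bigrams[a].values()) + len(vocab)
        total += math.log(count / denom)
    return total

# Minimal pair: subject-verb agreement.
good = "the keys are on the table"
bad = "the keys is on the table"
print(logprob(good) > logprob(bad))  # the grammatical variant should score higher
```

The "feed it back" idea would be to take the pairs the model gets wrong and turn them into extra training signal, rather than just reporting an accuracy number.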

Anyone know of work in this direction?

Given that grammar questions usually take the form of sentence completion, I think you could argue that the training of language models already does this!


Given the massive corpus of data these models are trained on, I don’t think there is any need for any further explicit grammar training.
To use your example: when children already know the local language well enough, that's supplemented with explicit grammar lessons; they aren't necessarily made to read more extensively and broadly. But reading more extensively and broadly is basically what's already happening when an LLM is trained.
