LLMs can use in-context grammatical descriptions for translation, but their performance degrades significantly with grammar complexity and sentence length—suggesting limits to learning language structure from textual descriptions alone.
This paper tests whether large language models can translate between formal languages when given the grammatical rules as in-context input. Using purpose-built grammar systems, the researchers found that translation accuracy degrades as grammar complexity and sentence length increase, and that models perform worse when the source and target languages differ in structure or writing system.