LLMs perform better when trained on and prompted with frequently occurring textual patterns, much as humans read common words faster; this simple principle can boost performance across multiple tasks.
This paper studies how the frequency of textual patterns affects large language model performance. The authors propose three techniques: rephrasing prompts with more frequent wording, generating training data that uses common expressions, and training models on progressively more frequent text. Experiments on math, translation, reasoning, and tool-use tasks show that these frequency-based approaches improve results.
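The first technique, preferring more frequent phrasings in prompts, can be illustrated with a minimal sketch. The toy corpus, the unigram scoring function, and the candidate prompts below are all illustrative assumptions, not the paper's actual method or data; the idea is simply to rank candidate phrasings by how often their words appear in a reference corpus.

```python
from collections import Counter

# Illustrative toy corpus; a real system would use large-scale n-gram counts.
corpus = (
    "what is the sum of two and three "
    "what is the sum of four and five "
    "compute the aggregate of two and three"
).split()
freq = Counter(corpus)

def frequency_score(phrasing: str) -> float:
    """Mean corpus frequency of the phrasing's words (unseen words score 0)."""
    words = phrasing.lower().split()
    return sum(freq[w] for w in words) / len(words)

# Two hypothetical phrasings of the same question.
candidates = [
    "Compute the aggregate of 2 and 3.",
    "What is the sum of 2 and 3?",
]

# Pick the phrasing built from more common words to use as the prompt.
best = max(candidates, key=frequency_score)
```

In this sketch `best` is the "sum" phrasing, since its words dominate the toy corpus; the same ranking idea extends naturally to n-gram or language-model scores over the whole phrasing.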