This paper shows that LLM factual recall is not random: it follows a predictable pattern driven by two factors, model size and how often a topic appears in the training data. Given those two factors, you can estimate which facts a model will reliably recall.