Language models don't learn randomly during pretraining: they follow a structured, predictable curriculum in which complex skills emerge from simpler ones in a consistent order across models, and that emergence can be predicted from their internal representations.
This paper shows that language models acquire skills in a predictable, compositional order during training, as if following a hidden curriculum. By evaluating models on simple tasks (retrieval, logic, math), the researchers find that the order in which skills emerge is remarkably consistent across different models and sizes, and that composite skills emerge only after their building blocks.
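The claim that emergence can be read off internal representations suggests a probing setup along the following lines. This is an illustrative sketch under assumed details, not the authors' method: it fits a linear probe (scikit-learn's LogisticRegression) to hidden states taken from successive training checkpoints and tracks how decodable a skill's labels are over training; the checkpoint steps, dimensions, and synthetic data are all placeholders.

```python
# A minimal sketch (not the paper's code) of probing hidden representations to
# anticipate a skill before it appears in task accuracy. Hidden states and
# labels are synthetic placeholders; in practice they would come from model
# checkpoints evaluated on a probe set for the skill.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def probe_skill(hidden_states: np.ndarray, labels: np.ndarray) -> float:
    """Cross-validated accuracy of a linear probe on hidden states."""
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, hidden_states, labels, cv=5).mean()

# Simulate hidden states from three checkpoints: the class signal (a stand-in
# for the skill becoming linearly decodable) grows as training progresses.
n, d = 200, 64
labels = rng.integers(0, 2, size=n)
for step, signal in [(1_000, 0.0), (10_000, 0.5), (100_000, 2.0)]:
    hidden = rng.normal(size=(n, d))
    hidden[:, 0] += signal * (2 * labels - 1)  # inject a decodable direction
    print(f"checkpoint {step}: probe accuracy = {probe_skill(hidden, labels):.2f}")
```

Rising probe accuracy across checkpoints, ahead of behavioral accuracy on the corresponding task, is the kind of signal that would let emergence order be predicted from representations alone.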