Understanding Data Temporality Impact on Large Language Models Pre-training

Pilchen Hippolyte, Fabre Romain, Signe Talla Franck, Perez Patrick, Grave Edouard | May 21, 2026 | arXiv

Key Takeaway

Training LLMs on chronologically ordered data instead of shuffled data improves their knowledge of recent facts and temporal accuracy, suggesting data ordering matters for building models that stay current.

Summary

This paper investigates how the order of training data affects what LLMs learn about time-sensitive facts. The researchers trained 6B-parameter models on chronologically ordered data and on shuffled data, and found that sequential training produces models with more current and accurate temporal knowledge while maintaining general language understanding.
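
To make the two training regimes concrete, here is a minimal sketch of how a corpus could be arranged in each one. It is an illustration only, not the authors' pipeline: the `Document` type, its `published` timestamp field, and the helper names `chronological_order` and `shuffled_order` are all hypothetical, and the sketch assumes every document carries a reliable publication date.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    """Hypothetical training document with an assumed publication date."""
    text: str
    published: date


def chronological_order(docs):
    """Return documents sorted oldest-first (the sequential regime)."""
    return sorted(docs, key=lambda d: d.published)


def shuffled_order(docs, seed=0):
    """Return documents in a seeded random order (the shuffled baseline)."""
    rng = random.Random(seed)
    out = list(docs)  # copy so the caller's list is left untouched
    rng.shuffle(out)
    return out


# Toy corpus; real pre-training data would be streamed, not held in a list.
corpus = [
    Document("report from 2021", date(2021, 6, 1)),
    Document("report from 2019", date(2019, 3, 15)),
    Document("report from 2023", date(2023, 11, 30)),
    Document("report from 2020", date(2020, 1, 10)),
]

sequential = chronological_order(corpus)
baseline = shuffled_order(corpus, seed=42)
```

In the sequential arrangement the model sees the most recent documents last, whereas the shuffled baseline mixes all time periods throughout training. Both arrangements contain exactly the same documents, so any difference in the resulting model comes from ordering alone.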

Key Terms

pretraining, knowledge-cutoff, temporal-grounding, factual-freshness