Time delay between the generation of consecutive output tokens during LLM inference; keeping it low is critical for real-time applications.
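
As a rough illustration of the definition, the delay can be computed as the gap between the arrival times of successive tokens. The sketch below assumes the arrival times have already been recorded (for example with `time.perf_counter()` as each token is streamed back); the function names and the sample timestamps are hypothetical and not taken from any particular library.

```python
from statistics import mean


def inter_token_latencies(timestamps):
    """Return the gaps (in seconds) between consecutive token arrival times."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def mean_inter_token_latency(timestamps):
    """Average gap between consecutive tokens; needs at least two timestamps."""
    gaps = inter_token_latencies(timestamps)
    if not gaps:
        raise ValueError("need at least two token timestamps")
    return mean(gaps)


# Hypothetical arrival times (seconds) for five streamed tokens.
arrival_times = [0.0, 0.5, 1.0, 1.5, 2.5]

gaps = inter_token_latencies(arrival_times)
average_gap = mean_inter_token_latency(arrival_times)
worst_gap = max(gaps)
```

Reporting the worst gap alongside the mean can be useful, since a single long pause may be more noticeable in a streaming interface than a slightly higher average.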