A token inserted during generation (e.g., <10.6 seconds>) that helps a model track elapsed speaking time.