A strategy that selects which tokens to generate next based on the model's prediction confidence, enabling adaptive and efficient generation.
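
As a rough illustration, the sketch below shows one way such a selection step might look. It rests on assumptions that are not in the definition above: confidence is taken to be the highest softmax probability at each position, every position at or above a cutoff is filled in the same step, and the single most confident position is filled when none reaches the cutoff. The names `select_confident_tokens` and `threshold`, and the toy logits, are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_confident_tokens(logits_per_position, open_positions, threshold=0.9):
    """Pick which open positions to fill in this step.

    logits_per_position: dict mapping position -> list of vocabulary logits
    open_positions: positions that have not been generated yet
    threshold: hypothetical confidence cutoff (an assumption for this sketch)
    Returns a dict mapping each chosen position to its predicted token id.
    """
    scored = []
    for pos in open_positions:
        probs = softmax(logits_per_position[pos])
        # Most likely token at this position; its probability is the confidence.
        token = max(range(len(probs)), key=probs.__getitem__)
        scored.append((probs[token], pos, token))

    if not scored:
        return {}

    chosen = [(c, p, t) for c, p, t in scored if c >= threshold]
    if not chosen:
        # Always make progress: fall back to the single most confident position.
        chosen = [max(scored)]
    return {p: t for _, p, t in chosen}


# Toy example: three open positions over a three-token vocabulary.
logits = {
    0: [8.0, 0.0, 0.0],  # sharply peaked on token 0
    1: [1.0, 1.1, 0.9],  # nearly uniform, so low confidence
    2: [0.0, 0.0, 6.0],  # sharply peaked on token 2
}
picked = select_confident_tokens(logits, [0, 1, 2], threshold=0.9)
print(picked)
```

In this sketch the number of positions filled per step depends on how many predictions reach the cutoff, which is one possible reading of "adaptive": easy positions are filled together, while uncertain ones wait for a later step with more context.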