A test-time process where a verifier scores candidate solutions and the model refines low-scoring ones iteratively.