Token-Level Reward — Glossary — ThinkLLM