Differentiable Reward — Glossary — ThinkLLM