Potential-based Reward Shaping — Glossary — ThinkLLM