Preference-based Reinforcement Learning — Glossary — ThinkLLM