Reinforcement Learning from Internal Feedback (RLIF) — Glossary — ThinkLLM