Self-Distilled Agentic Reinforcement Learning — ThinkLLM