Rubric-Grounded RL: Structured Judge Rewards for Generalizable Reasoning — ThinkLLM