FASTER: Value-Guided Sampling for Fast RL — ThinkLLM