ParetoSlider: Diffusion Models Post-Training for Continuous Reward Control — ThinkLLM