Stepwise credit assignment—rewarding each diffusion step for its own improvement rather than the final result—makes RL training of image generators more efficient and faster to converge.
This paper improves reinforcement learning for image generation models by assigning credit per diffusion step rather than uniformly across the trajectory. Instead of treating all steps equally, it recognizes that early steps determine composition while late steps refine details, and rewards each step according to its specific contribution to the final image. This leads to faster convergence and better sample efficiency.
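The core idea can be sketched with reward differencing: score intermediate denoising states with a reward model and credit each step with the improvement it produced. This is a minimal illustrative sketch, not the paper's actual algorithm; the score list and function names are hypothetical stand-ins for a real reward model over intermediate latents.

```python
def stepwise_advantages(step_scores):
    """Per-step credit: the reward improvement each denoising step contributed.

    step_scores[t] is a (hypothetical) reward-model score of the partially
    denoised sample after step t. The per-step advantage is the difference
    between consecutive scores, so credits telescope to the total gain.
    """
    return [step_scores[t + 1] - step_scores[t]
            for t in range(len(step_scores) - 1)]

# Toy trajectory: early steps (composition) improve the score most,
# late steps (detail refinement) add smaller gains.
scores = [0.10, 0.35, 0.50, 0.58, 0.60]
advantages = stepwise_advantages(scores)
```

Because the per-step credits sum to the overall improvement, a policy-gradient update weighted by these advantages rewards each step for its own contribution instead of broadcasting the final reward back to every step.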