In reinforcement learning, a function that assigns a numerical score to each model output; training adjusts the model to produce higher-scoring outputs, steering it toward desired behaviors.
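
As a rough illustration of the definition, the sketch below scores text outputs against a known answer. Everything in it is an assumption made for the example: the name `toy_reward`, the `reference` and `max_words` parameters, and the 0.01 per-word length penalty are hypothetical and do not come from any particular library or training setup.

```python
def toy_reward(output: str, reference: str, max_words: int = 50) -> float:
    """Hypothetical reward: 1.0 if the output contains the reference answer,
    minus 0.01 for every word beyond max_words."""
    # Base score: does the output mention the expected answer at all?
    score = 1.0 if reference.lower() in output.lower() else 0.0
    # Length penalty: discourage padding the output with extra words.
    extra_words = max(0, len(output.split()) - max_words)
    return score - 0.01 * extra_words


# Rank candidate outputs by their score; a training loop would instead use
# these scores as the signal for updating the model.
candidates = ["The answer is 4.", "I think it is 5."]
best = max(candidates, key=lambda c: toy_reward(c, reference="4"))
print(best)
```

In practice the score might come from a learned model or a set of programmatic checks rather than a substring match; the point here is only the shape of the interface: output in, number out.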