Determining how much each token in a response should be rewarded or penalized based on overall performance.