A mathematical operator that updates value estimates based on immediate rewards and future value predictions.