A measure of difference between probability distributions over predicted tokens, used to align model outputs.
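
The entry does not name a specific measure. As an illustration only, the sketch below assumes Kullback-Leibler (KL) divergence, one common choice for comparing two distributions over the same token vocabulary. The function names, the three-token vocabulary, and the logit values are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution over tokens."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two distributions over the same vocabulary.

    eps is a small smoothing constant (an assumption of this sketch)
    that avoids division by zero when q assigns zero probability.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Hypothetical next-token logits from two models over a 3-token vocabulary.
reference_logits = [2.0, 1.0, 0.1]
candidate_logits = [1.5, 1.2, 0.3]

p = softmax(reference_logits)
q = softmax(candidate_logits)
divergence = kl_divergence(p, q)
```

Under this assumption, a value near zero means the two models assign nearly the same probabilities to each token, and larger values mean they disagree more. KL divergence is asymmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.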