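A measure of the model's uncertainty over the next token given the preceding context. Low entropy means the predicted distribution is concentrated on a few tokens, which can signal memorization; high entropy means probability is spread across many tokens, which can signal generalization.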
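
As a rough illustration, the sketch below computes the entropy of a single next-token distribution. It assumes the measure meant here is Shannon entropy in bits, H = -Σ p(t | context) · log2 p(t | context), which the definition does not state explicitly. The function name `token_entropy` and both example distributions are hypothetical and chosen only to contrast a peaked prediction with a flat one.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in bits) of one next-token probability distribution.

    `probs` is assumed to hold the model's probabilities for each candidate
    token at a single position, summing to 1.
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0, rel_tol=1e-6):
        raise ValueError("probabilities must sum to 1")
    # Terms with p == 0 contribute nothing (the limit of p * log p is 0).
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical distributions over a four-token vocabulary.
peaked = [0.97, 0.01, 0.01, 0.01]   # model is nearly certain of one token
uniform = [0.25, 0.25, 0.25, 0.25]  # model has no preference

print(f"peaked:  {token_entropy(peaked):.3f} bits")
print(f"uniform: {token_entropy(uniform):.3f} bits")
```

The peaked distribution gives a value close to zero, while the uniform one reaches the maximum for four tokens, log2(4) = 2 bits. In practice the per-position values would typically be averaged over a sequence or dataset before drawing any conclusion about memorization or generalization.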