Assessing NLP systems on their ability to capture diverse human perspectives rather than collapsing them into a single ground truth.