Modeling annotator demographics explicitly, not just their labels, is crucial for NLP systems handling subjective tasks. DiADEM shows that accounting for annotator race and age consistently predicts disagreement patterns better than treating all annotators as interchangeable.
When people label subjective content like offensive speech, they disagree—and that disagreement matters. This paper introduces DiADEM, a neural model that learns which demographic factors (race, age, etc.) drive annotator disagreement, rather than flattening diverse perspectives into a single label. DiADEM outperforms LLMs and standard models at predicting who will disagree and why.
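To make "flattening" concrete, the sketch below contrasts a majority-vote label with per-group label rates on a toy example. It is not DiADEM and does not reproduce any of the paper's data or results: the annotations, the group names (`group_a`, `group_b`), and the helper functions are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter, defaultdict

# Hypothetical toy annotations: (item_id, annotator_group, label), 1 = "offensive".
# The groups stand in for any demographic attribute; none of this is real data.
annotations = [
    ("post_1", "group_a", 1),
    ("post_1", "group_a", 1),
    ("post_1", "group_b", 0),
    ("post_1", "group_b", 0),
    ("post_1", "group_b", 0),
    ("post_2", "group_a", 1),
    ("post_2", "group_b", 1),
    ("post_2", "group_b", 1),
]


def majority_label(rows, item_id):
    """Collapse all annotations for one item into a single majority-vote label."""
    labels = [label for item, _, label in rows if item == item_id]
    return Counter(labels).most_common(1)[0][0]


def group_rates(rows, item_id):
    """Return, per annotator group, the fraction of annotators who chose label 1."""
    by_group = defaultdict(list)
    for item, group, label in rows:
        if item == item_id:
            by_group[group].append(label)
    return {group: sum(labels) / len(labels) for group, labels in by_group.items()}


if __name__ == "__main__":
    for item in ("post_1", "post_2"):
        print(item, majority_label(annotations, item), group_rates(annotations, item))
```

For `post_1` the majority vote yields a single label, while the per-group rates show the two hypothetical groups disagreeing completely. That hidden split is the kind of signal a model that conditions on annotator attributes can learn from and a single aggregated label cannot.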