Learning Who Disagrees: Demographic Importance Weighting for Modeling Annotator Distributions with DiADEM

Samay U. Shetty, Tharindu Cyril Weerasooriya, Deepak Pandita, Christopher M. Homan | April 9, 2026 | arXiv

Key Takeaway

Modeling annotator demographics explicitly, not just their labels, is crucial for NLP systems that handle subjective tasks. DiADEM shows that modeling race and age consistently predicts disagreement patterns better than treating all annotators as interchangeable.

Summary

When people label subjective content such as offensive speech, they disagree, and that disagreement matters. This paper introduces DiADEM, a neural model that learns which demographic factors (race, age, etc.) drive annotator disagreement, rather than flattening diverse perspectives into a single label. DiADEM outperforms LLMs and standard models at predicting who will disagree and why.
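To make the contrast between "a single label" and a distribution of annotator responses concrete, here is a minimal sketch in Python. It is not DiADEM and does not reproduce the paper's method. It only shows, on invented toy data, how a majority vote discards the per-group disagreement that a demographic-aware model would try to predict. The age buckets, labels, and function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy annotations for ONE item: (annotator_age_group, label).
# Groups, labels, and counts are invented for illustration only.
annotations = [
    ("18-29", "offensive"),
    ("18-29", "offensive"),
    ("18-29", "not_offensive"),
    ("50+", "not_offensive"),
    ("50+", "not_offensive"),
]


def majority_label(pairs):
    """Flatten all annotations into the single most common label."""
    counts = Counter(label for _, label in pairs)
    return counts.most_common(1)[0][0]


def label_distribution(pairs):
    """Keep the full distribution of labels instead of one winner."""
    counts = Counter(label for _, label in pairs)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def distribution_by_group(pairs):
    """Compute a separate label distribution for each demographic group."""
    grouped = defaultdict(list)
    for group, label in pairs:
        grouped[group].append((group, label))
    return {group: label_distribution(items) for group, items in grouped.items()}


if __name__ == "__main__":
    print("majority:", majority_label(annotations))
    print("overall distribution:", label_distribution(annotations))
    print("per-group distribution:", distribution_by_group(annotations))
```

In this toy data the majority vote reports "not_offensive", even though two of the three annotators in the hypothetical 18-29 group chose "offensive". The per-group distributions preserve that split, which is the kind of signal the summary says DiADEM learns to model.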

evaluation, data, alignment

Key Terms

annotator-disagreement, demographic-importance-weighting, perspectivist-evaluation