When building coreference systems for software mentions, choose between lexical and contextual methods based on your upstream noise type and corpus size: embeddings handle boundary noise better and scale more efficiently to large corpora, while string matching degrades more gracefully under substitution errors.
This paper compares two approaches for identifying when software names refer to the same project across documents: a simple string-matching method and an embedding-based approach. Testing on noisy data shows they fail in different ways: embeddings handle boundary errors better, while string matching handles substitution errors better. Embeddings also scale more efficiently to large datasets.
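The contrast between the two failure modes can be illustrated with a self-contained sketch. The names, noise variants, and the character-trigram vector below are illustrative assumptions, not the paper's actual models: `string_sim` stands in for the lexical matcher (difflib's edit-based ratio), and a trigram count vector with cosine similarity stands in for a learned embedding.

```python
from difflib import SequenceMatcher
from math import sqrt

def string_sim(a: str, b: str) -> float:
    """Lexical similarity: normalized edit-based ratio (the string-matching side)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def trigram_vec(s: str) -> dict:
    """Character-trigram count vector; a cheap stand-in for a learned embedding."""
    s = s.lower()
    vec = {}
    for i in range(len(s) - 2):
        gram = s[i:i + 3]
        vec[gram] = vec.get(gram, 0) + 1
    return vec

def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c * v.get(g, 0) for g, c in u.items())
    nu = sqrt(sum(c * c for c in u.values()))
    nv = sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical mentions of one project under the two noise types.
canonical = "TensorFlow"
boundary = "TensorFlo"      # boundary error: mention span truncated by the tagger
substituted = "TensorFlaw"  # substitution error: one character replaced

for variant in (boundary, substituted):
    lex = string_sim(canonical, variant)
    ctx = cosine(trigram_vec(canonical), trigram_vec(variant))
    print(f"{variant}: lexical={lex:.2f}, trigram-cosine={ctx:.2f}")
```

Under these toy inputs, truncation removes only the trigrams at the affected boundary, so the vector similarity stays high, whereas a mid-string substitution destroys every trigram spanning the changed character; the edit-based ratio penalizes both variants by a comparable single-character cost.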