SPECTRA: Synthetic IR Test Collections with Relevance Oracles and Controlled Distractor Diagnostics

Eric Liang|May 29, 2026arXiv

Key Takeaway

You can now generate large, reproducible test collections with known relevance answers to diagnose IR system failures at scale—useful for stress-testing before building expensive human-judged benchmarks.

Summary

SPECTRA is a framework for generating synthetic text collections and test datasets for information retrieval systems. Instead of relying on expensive human-labeled data, it creates controllable document corpora with built-in relevance labels, letting researchers test how search systems handle scale, latency, and ranking challenges before investing in real human evaluation.

evaluation data

Key Terms

synthetic-data relevance-oracle cranfield-evaluation long-tail-vocabulary graded-relevance