Operation-Guided Progressive Human-to-AI Text Transformation Benchmark for Multi-Granularity AI-Text Detection

Sondos Mahmoud Bsharat, Jiacheng Liu, Xiaohan Zhao, Tianjun Yao, Xinyi Shang et al. | June 4, 2026 | arXiv

Key Takeaway

AI-text detection depends on more than how much AI content is present: the kind of edits made, the domain, and the revision history all matter. Mixed-authorship documents can be harder to detect than fully AI-generated ones, exposing blind spots in current detection methods.

Summary

This paper introduces OpAI-Bench, a benchmark for detecting AI-generated text in documents that have been progressively edited by both humans and AI. Unlike existing benchmarks that only look at final outputs, OpAI-Bench tracks how AI authorship signals change across multiple revision stages, edit types, and document granularities (document, sentence, token, and span levels).
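
To make the four granularities concrete, here is a minimal sketch of how span-, sentence-, and document-level views could be derived from token-level authorship labels. It assumes a binary label per token (0 = human, 1 = AI) and known sentence boundaries; the function names, the label encoding, and the toy data are illustrative assumptions and are not taken from OpAI-Bench.

```python
from itertools import groupby


def ai_spans(token_labels):
    """Return (start, end) pairs, end exclusive, for contiguous runs of AI tokens."""
    spans = []
    pos = 0
    for label, run in groupby(token_labels):
        length = len(list(run))
        if label == 1:
            spans.append((pos, pos + length))
        pos += length
    return spans


def sentence_fractions(token_labels, sentence_bounds):
    """Fraction of AI tokens in each sentence, given (start, end) token bounds."""
    return [sum(token_labels[s:e]) / (e - s) for s, e in sentence_bounds]


def document_fraction(token_labels):
    """Fraction of AI tokens in the whole document."""
    return sum(token_labels) / len(token_labels)


# Hypothetical toy document: 8 tokens, 2 sentences of 4 tokens each.
token_labels = [0, 0, 1, 1, 0, 1, 1, 1]
sentence_bounds = [(0, 4), (4, 8)]

spans = ai_spans(token_labels)
per_sentence = sentence_fractions(token_labels, sentence_bounds)
per_document = document_fraction(token_labels)
```

In this toy case the document is 62.5% AI tokens, yet the two sentences differ (50% and 75%) and the AI text sits in two separate spans. A single document-level score hides that structure, which is why the finer granularities are evaluated separately.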

evaluation, safety, data

Key Terms

ai-text-detection, benchmark, authorship-attribution, mixed-authorship, granularity