Claw-Anything: Benchmarking Always-On Personal Assistants with Broader Access to User's Digital World

Yusong Lin, Xinyuan Liang, Haiyang Wang, Qipeng Gu, Siqi Cheng et al. | May 25, 2026 | arXiv

Key Takeaway

Building truly useful AI assistants requires handling messy, interconnected real-world contexts rather than isolated tasks. Current models fall far short of this challenge, but synthetic data generation can help close the gap.

Summary

Claw-Anything is a benchmark for testing AI agents as always-on personal assistants with access to a user's full digital world, including activity history, multiple services, and both GUI and CLI interfaces.

agents evaluation reasoning

Key Terms

agentic-systems always-on-personal-assistants multi-round-event-injection proactive-assistance contextual-reasoning