KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation

Tongbo Chen, Zhengxi Lu, Zhan Xu, Guocheng Shao, Shaohan Zhao et al. | April 9, 2026 | arXiv

Key Takeaway

Building trustworthy personal assistants requires more than good GUI navigation: agents must actively learn user preferences through dialogue and judge when to intervene, two capabilities that even frontier models currently struggle with.

Summary

KnowU-Bench is a new benchmark for evaluating mobile agents that must learn user preferences through interaction and decide when to proactively help.

agents evaluation applications

Key Terms

agentic-multimodal-models preference-alignment multi-turn-dialogue llm-as-a-judge intervention-calibration