ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents

Xuhao Hu, Xi Zhang, Haiyang Xu, Kyle Qiao, Jingyi Yang et al. | May 12, 2026 | arXiv

Key Takeaway

Agents perform better when trained to decide dynamically between GUI actions and tool calls rather than relying on only one approach. This hybrid strategy reportedly improved accuracy by 66% on real-world tasks.

Summary

ToolCUA trains computer use agents to choose between GUI actions (clicks, typing) and tool calls (APIs). It synthesizes diverse training trajectories from existing data, then uses reinforcement learning to optimize when the agent switches between the two action types. This addresses a key problem for digital agents: knowing when to use high-level tools and when to fall back on low-level GUI interactions.
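To make the idea of a mixed action space concrete, here is a minimal Python sketch of an agent step that can emit either a GUI action or a tool call. Everything in it is an illustrative assumption and not the paper's implementation: the names `GuiAction`, `ToolCall`, `choose_action`, and `run` are hypothetical, and the hand-written rule in `choose_action` only stands in for the policy that ToolCUA learns through training.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class GuiAction:
    """A low-level interface action such as a click or typed text."""
    kind: str      # e.g. "click" or "type"
    target: str    # hypothetical description of the UI element
    text: str = ""


@dataclass
class ToolCall:
    """A high-level call to a named tool (API)."""
    name: str
    arguments: Dict[str, str] = field(default_factory=dict)


Action = Union[GuiAction, ToolCall]


def choose_action(subgoal: str, tools: Dict[str, Callable[..., str]]) -> Action:
    # Assumption: a toy rule replaces the learned policy. If a tool is
    # registered under the subgoal's name, call it; otherwise click in the GUI.
    if subgoal in tools:
        return ToolCall(name=subgoal)
    return GuiAction(kind="click", target=subgoal)


def run(subgoals: List[str], tools: Dict[str, Callable[..., str]]) -> List[str]:
    """Execute each subgoal and record which action type was used."""
    log = []
    for subgoal in subgoals:
        action = choose_action(subgoal, tools)
        if isinstance(action, ToolCall):
            result = tools[action.name](**action.arguments)
            log.append(f"tool:{action.name} -> {result}")
        else:
            # A real agent would dispatch this to the screen; here we only log it.
            log.append(f"gui:{action.kind} {action.target}")
    return log


# Hypothetical tool registry with a single tool.
TOOLS = {"export_csv": lambda: "saved report.csv"}
TRACE = run(["open_settings", "export_csv"], TOOLS)
```

In this sketch the trajectory mixes both action types: the first subgoal has no matching tool and is handled through the GUI, while the second is routed to the tool. The learned component described in the summary would replace `choose_action`.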

agents, training, reasoning

Key Terms

agentic-tasks, tool-use, reinforcement-learning-from-verifiable-rewards, trajectory-synthesis, supervised-fine-tuning