Tool Attention Is All You Need: Dynamic Tool Gating and Lazy Schema Loading for Eliminating the MCP/Tools Tax in Scalable Agentic Workflows

Anuj Sadani, Deepak Kumar | April 23, 2026 | arXiv

Key Takeaway

Tool schema injection is a hidden operational cost in agent systems. Tool Attention addresses it by filtering out irrelevant tools and deferring full schema loading, reducing per-turn tokens from ~47k to ~2.4k without sacrificing capability.

Summary

This paper introduces Tool Attention, a middleware system that reduces the token overhead of injecting tool schemas into LLM agents. It filters the available tools by task intent and access rules, and it loads a tool's full schema only when that tool is actually needed. In multi-tool deployments this cuts tool-related tokens by about 95%, making agentic workflows more efficient and cost-effective.
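The two ideas in the summary, gating and lazy schema loading, can be illustrated with a minimal Python sketch. This is not the paper's implementation: every name here (`ToolStub`, `relevance`, `gate_tools`, `build_context`, the example tools and roles) is hypothetical, and the word-overlap score is only a stand-in for whatever intent matching the real system uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolStub:
    """Lightweight registry entry; the full schema is NOT stored here."""
    name: str
    summary: str                      # short description used for gating
    allowed_roles: set[str]           # access rule (hypothetical role model)
    load_schema: Callable[[], dict]   # deferred loader for the full schema


def relevance(intent: str, stub: ToolStub) -> float:
    """Toy word-overlap score; an assumption, not the paper's scoring method."""
    intent_words = set(intent.lower().split())
    summary_words = set(stub.summary.lower().split())
    if not summary_words:
        return 0.0
    return len(intent_words & summary_words) / len(summary_words)


def gate_tools(intent: str, role: str, registry: list[ToolStub],
               top_k: int = 2) -> list[ToolStub]:
    """Keep only tools the caller may use and that match the intent."""
    permitted = [s for s in registry if role in s.allowed_roles]
    scored = [(relevance(intent, s), s) for s in permitted]
    scored = [pair for pair in scored if pair[0] > 0.0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [stub for _, stub in scored[:top_k]]


def build_context(intent: str, role: str,
                  registry: list[ToolStub]) -> list[dict]:
    """Load full schemas only for the tools that survive gating."""
    return [stub.load_schema() for stub in gate_tools(intent, role, registry)]


# Records which schemas were actually loaded, to make the laziness visible.
loaded: list[str] = []


def make_loader(name: str) -> Callable[[], dict]:
    def _load() -> dict:
        loaded.append(name)
        # Placeholder schema; a real one would be fetched from the tool server.
        return {"name": name, "parameters": {"type": "object", "properties": {}}}
    return _load


registry = [
    ToolStub("web_search", "search the web", {"user", "admin"},
             make_loader("web_search")),
    ToolStub("delete_db", "delete database records", {"admin"},
             make_loader("delete_db")),
    ToolStub("calendar", "create calendar events", {"user"},
             make_loader("calendar")),
]

if __name__ == "__main__":
    schemas = build_context("search the web for news", "user", registry)
    print([s["name"] for s in schemas], "loaded:", loaded)
```

In this sketch the access rule is applied before relevance scoring, so a tool the caller is not permitted to use never reaches the model's context, and only the surviving tools pay the cost of a full schema.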

agents, efficiency, architecture

Key Terms

tool-use, schema-context, lazy-loading, kv-cache, gating-mechanism