Direct cache-based synthesis lets LLM agents combine parallel branches without redundant computation, making multi-agent workflows faster and more aligned with how modern systems actually work.
This paper introduces Parallel-Synthesis, a framework in which an LLM agent consumes the KV caches produced by multiple parallel worker branches directly, instead of concatenating the branches' text outputs. Operating on the KV caches cuts computation time by a factor of 2.5 to 11 while maintaining or improving performance across math, code, and reasoning tasks.
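
To make the contrast with text concatenation concrete, here is a toy sketch in Python. It is not the paper's implementation: `BranchCache`, `merge_caches`, and the two cost functions are hypothetical names invented for illustration, and the model ignores positional encodings and everything else a real system would have to handle when combining caches. It only counts how many tokens the synthesizing agent has to encode under each strategy.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BranchCache:
    """Hypothetical stand-in for one worker's KV cache (one entry per encoded token)."""
    keys: List[List[float]] = field(default_factory=list)
    values: List[List[float]] = field(default_factory=list)

    def __len__(self) -> int:
        return len(self.keys)


def merge_caches(caches: List[BranchCache]) -> BranchCache:
    """Join branch caches along the token axis without re-encoding any token."""
    merged = BranchCache()
    for cache in caches:
        merged.keys.extend(cache.keys)
        merged.values.extend(cache.values)
    return merged


def prefill_tokens_text_concat(branch_lengths: List[int], instruction_len: int) -> int:
    """Text concatenation: every branch token is encoded again by the synthesizer."""
    return sum(branch_lengths) + instruction_len


def prefill_tokens_cache_reuse(branch_lengths: List[int], instruction_len: int) -> int:
    """Cache reuse (assumed ideal case): only the synthesizer's own instruction is encoded."""
    return instruction_len
```

Under these assumptions, the tokens the synthesizer must encode grow with the total length of all branch outputs when text is concatenated, but stay fixed at the instruction length when the caches are reused. The speedups reported in the paper come from its own measurements, not from this simplified count.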