DIRECT: When and Where Should You Allocate Test-Time Compute in Embodied Planners?

Jadelynn Dao, Milan Ganai, Yasmina Abukhadra, Ajay Sridhar, Mozhgan Nasr Azadani et al.|June 10, 2026arXiv

Key Takeaway

Test-time compute scaling helps embodied AI, but different scaling strategies (reasoning depth, model size, memory) have different costs and benefits—smart routing based on scene context can match stronger models at 65% lower latency.

Summary

This paper introduces DIRECT, a routing framework that intelligently allocates test-time compute for vision-language models used as robot planners. Instead of uniformly scaling compute (which increases latency and cost), DIRECT analyzes scene context to decide when to use chain-of-thought reasoning, larger models, or extended memory—achieving better performance per dollar spent on real robots.

agents efficiency reasoning

Key Terms

test-time-compute chain-of-thought vision-language-model embodied-agent routing