Foundation models trained for visual reasoning can solve specialized engineering problems such as chip design without fine-tuning, by framing physical constraints as spatial reasoning tasks.
This paper uses vision-language models (VLMs) to improve chip floorplanning, the task of arranging components on a chip to minimize wiring. The approach, called VeoPlace, treats the chip layout as a visual problem: a VLM suggests component placements without any task-specific training, and those suggestions are then iteratively refined. It outperforms existing machine learning methods by up to 32% on standard benchmarks.
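To make the suggest-then-refine idea concrete, here is a minimal Python sketch of a VLM-guided placement loop. It is not VeoPlace itself: the `query_vlm` stub, the half-perimeter wirelength (HPWL) cost, and the greedy acceptance rule are all assumptions introduced for illustration, standing in for the rendered-layout prompting and refinement the paper describes.

```python
"""Illustrative sketch (not the paper's code) of a VLM-guided placement loop."""
import random
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    x: float  # placement coordinates on a unit chip canvas
    y: float


def hpwl(placement: dict[str, Component], nets: list[list[str]]) -> float:
    """Half-perimeter wirelength: net bounding-box half-perimeters, summed."""
    total = 0.0
    for net in nets:
        xs = [placement[n].x for n in net]
        ys = [placement[n].y for n in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def query_vlm(placement: dict[str, Component]) -> dict[str, tuple[float, float]]:
    """Hypothetical stand-in for prompting a VLM with a rendered layout.

    A real system would rasterize the placement to an image, send it to the
    model, and parse suggested coordinates from the response; here we return
    random nudges so the sketch runs end to end.
    """
    return {
        name: (c.x + random.uniform(-0.1, 0.1), c.y + random.uniform(-0.1, 0.1))
        for name, c in placement.items()
    }


def refine(placement, nets, rounds=20):
    """Iteratively keep VLM suggestions only when they lower wirelength (assumed rule)."""
    best = hpwl(placement, nets)
    for _ in range(rounds):
        suggestion = query_vlm(placement)
        trial = {n: Component(n, x, y) for n, (x, y) in suggestion.items()}
        cost = hpwl(trial, nets)
        if cost < best:  # greedy acceptance of improving layouts
            placement, best = trial, cost
    return placement, best


if __name__ == "__main__":
    comps = {n: Component(n, random.random(), random.random()) for n in "abcd"}
    nets = [["a", "b"], ["b", "c", "d"], ["a", "d"]]
    final, cost = refine(comps, nets)
    print(f"final HPWL: {cost:.3f}")
```

The greedy acceptance step is only one plausible refinement strategy; the actual method may weigh additional constraints such as overlap and density rather than wirelength alone.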