Semantic Browsing: Controllable Diversity for Image Generation

Sara Dorfman, Maya Vishnevsky, Omer Dahary, Or Patashnik, Daniel Cohen-Or | June 22, 2026 | arXiv

Key Takeaway

By generating diverse prompts rather than diverse images from a single prompt, you can create a navigable design space in which each variation reflects a semantically meaningful difference that users can understand, not a random visual one.

Summary

This paper addresses the diversity problem in text-to-image generation by shifting the source of variation from the model's random sampling to the text prompt itself.
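
As a rough illustration of that shift, the sketch below varies the prompt instead of the sampling seed, so every variant carries a readable label describing how it differs. It is not the paper's method: the function `prompt_variations`, the `axes` dictionary, and the example prompt are all hypothetical, and the axes are written by hand here, whereas this summary does not say how the paper produces its prompt variations.

```python
from itertools import product

def prompt_variations(base_prompt, axes):
    """Return one labeled prompt per combination of axis values.

    `axes` maps a hypothetical axis name (e.g. "style") to a list of
    text fragments; both the names and the fragments are assumptions
    made for this illustration.
    """
    names = list(axes)
    variations = []
    for combo in product(*(axes[name] for name in names)):
        labels = dict(zip(names, combo))                  # what changed, in words
        prompt = base_prompt + ", " + ", ".join(combo)    # the prompt to render
        variations.append({"labels": labels, "prompt": prompt})
    return variations

# Hand-written, hypothetical semantic axes for a single base prompt.
axes = {
    "style": ["watercolor", "line drawing"],
    "setting": ["in a forest", "on a city street"],
}
variants = prompt_variations("a red bicycle", axes)
for v in variants:
    print(v["labels"], "->", v["prompt"])
```

Each resulting prompt would then be passed to a text-to-image model. Because the labels travel with the prompts, a user can browse the results by axis ("style", "setting") instead of comparing unlabeled samples drawn from different seeds.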

multimodal, agents, applications

Key Terms

text-to-image-generation vision-language-model agentic-workflow semantic-variation