Image generation where the model actively infers implicit user intents from text descriptions rather than literal interpretation.