What Gets Unmasked First? Trajectory Analysis of Diffusion Models for Graph-to-Text Generation

Qing Wang, Jacob Devasier, Chengkai Li | May 29, 2026 | arXiv

Key Takeaway

Masked diffusion models decode differently from autoregressive LLMs: they uncover entities first, then relations. Supervised fine-tuning disrupts this natural strategy, but a simple inference-time fix recovers significant quality gains.

Summary

This paper studies how masked diffusion language models generate text from graphs by analyzing the order in which tokens are unmasked. The authors find that these models naturally prioritize entities before relational words, but that fine-tuning breaks this strategy by locking in sentence-ending tokens too early.
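To make the idea of an unmasking-order analysis concrete, here is a minimal sketch of one way such a trajectory could be summarized. It is not the authors' method: the function name, the category labels ("entity", "relation", "eos"), and the toy values are all assumptions made for illustration. It averages the normalized step at which each token category is unmasked, so a smaller value means that category tends to be revealed earlier.

```python
from collections import defaultdict


def mean_unmask_position(unmask_steps, categories):
    """Average normalized unmasking step per token category.

    unmask_steps[i] is the denoising step at which token i was unmasked;
    categories[i] is a hypothetical label such as "entity" or "relation".
    Returned values lie in [0, 1]; smaller means unmasked earlier.
    """
    if len(unmask_steps) != len(categories):
        raise ValueError("one category label is required per token")
    if not unmask_steps:
        return {}
    last_step = max(unmask_steps)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for step, cat in zip(unmask_steps, categories):
        # Normalize by the final step so sequences of different length compare.
        totals[cat] += step / last_step if last_step else 0.0
        counts[cat] += 1
    return {cat: totals[cat] / counts[cat] for cat in totals}


# Toy trajectory with made-up values (not data from the paper).
tokens = ["Alice", "works", "at", "Acme", "."]
categories = ["entity", "relation", "relation", "entity", "eos"]
unmask_steps = [1, 3, 4, 1, 2]
profile = mean_unmask_position(unmask_steps, categories)
```

In this toy case the entity tokens receive the lowest average position and the relation tokens the highest, which is the kind of ordering the summary describes. A sentence-ending token with an unusually low position would be the pattern the summary attributes to fine-tuned models.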

training, reasoning, evaluation

Key Terms

masked diffusion language model, graph-to-text generation, decoding trajectory, structural token, inference-time modification