Despite claims of progress, multimodal domain generalization methods show only marginal gains over simple baselines when compared fairly; the field needs stronger methods and standardized evaluation to move forward.
This paper introduces MMDG-Bench, the first standardized benchmark for multimodal domain generalization, spanning action recognition, fault diagnosis, and sentiment analysis. Evaluating 9 methods on 6 datasets with 7,402 trained models, it shows that recent specialized methods barely outperform simple baselines, that no method works consistently across tasks, and that all methods struggle with corrupted or missing data.