The ability to solve problems by integrating information from multiple input types, such as images and text.