AI systems that can process multiple types of input (such as text and images) and actively interact with external tools and environments.