A compact multimodal model from IBM's Granite family that handles both text and image inputs. At 4 billion parameters, it sits in the lightweight tier, making it practical for resource-constrained deployments while still offering vision capabilities. Its open-weight Apache 2.0 license means weights are freely available for inspection, fine-tuning, and commercial use.